the expressed toxins of the invention, treated to prolong the pesticidal activity when the
substantially intact cells are applied to the environment of a target pest. The treated cell acts as 
a protective coating for the pesticidal toxin. The toxin becomes active upon ingestion by a target 
insect. 

Brief Description of the Sequences

SEQ ID NO. 1 is a forward primer, designated "the 339 forward primer," used according to the subject invention.
SEQ ID NO. 2 is a reverse primer, designated "the 339 reverse primer," used according to the subject invention.
SEQ ID NO. 3 is a nucleotide sequence encoding a toxin from B.t. strain PS36A.
SEQ ID NO. 4 is an amino acid sequence for the 36A toxin.
SEQ ID NO. 5 is a nucleotide sequence encoding a toxin from B.t. strain PS81F.
SEQ ID NO. 6 is an amino acid sequence for the 81F toxin.
SEQ ID NO. 7 is a nucleotide sequence encoding a toxin from B.t. strain Javelin 1990.
SEQ ID NO. 8 is an amino acid sequence for the Javelin 1990 toxin.
SEQ ID NO. 9 is a forward primer, designated "158C2 PRIMER A," used according to the subject invention.
SEQ ID NO. 10 is a nucleotide sequence encoding a portion of a soluble toxin from B.t. PS158C2.
SEQ ID NO. 11 is a forward primer, designated "49C PRIMER A," used according to the subject invention.
SEQ ID NO. 12 is a nucleotide sequence of a portion of a toxin gene from B.t. strain PS49C.
SEQ ID NO. 13 is a forward primer, designated "49C PRIMER B," used according to the subject invention.
SEQ ID NO. 14 is a reverse primer, designated "49C PRIMER C," used according to the subject invention.
SEQ ID NO. 15 is an additional nucleotide sequence of a portion of a toxin gene from PS49C.
SEQ ID NO. 16 is a forward primer used according to the subject invention.
SEQ ID NO. 17 is a reverse primer used according to the subject invention.
SEQ ID NO. 18 is a nucleotide sequence of a toxin gene from B.t. strain PS10E1.
SEQ ID NO. 19 is an amino acid sequence from the 10E1 toxin.
SEQ ID NO. 20 is a nucleotide sequence of a toxin gene from B.t. strain PS31J2.
SEQ ID NO. 21 is an amino acid sequence from the 31J2 toxin.
SEQ ID NO. 22 is a nucleotide sequence of a toxin gene from B.t. strain PS33D2.
SEQ ID NO. 23 is an amino acid sequence from the 33D2 toxin.
SEQ ID NO. 24 is a nucleotide sequence of a toxin gene from B.t. strain PS66D3.
SEQ ID NO. 25 is an amino acid sequence from the 66D3 toxin.
SEQ ID NO. 26 is a nucleotide sequence of a toxin gene from B.t. strain PS68F.
SEQ ID NO. 27 is an amino acid sequence from the 68F toxin.
SEQ ID NO. 28 is a nucleotide sequence of a toxin gene from B.t. strain PS69AA2.
SEQ ID NO. 29 is an amino acid sequence from the 69AA2 toxin.
SEQ ID NO. 30 is a nucleotide sequence of a toxin gene from B.t. strain PS168G1.
SEQ ID NO. 31 is a nucleotide sequence of a MIS toxin gene from B.t. strain PS177C8.
SEQ ID NO. 32 is an amino acid sequence from the 177C8-MIS toxin.
SEQ ID NO. 33 is a nucleotide sequence of a toxin gene from B.t. strain PS177I8.
SEQ ID NO. 34 is an amino acid sequence from the 177I8 toxin.
SEQ ID NO. 35 is a nucleotide sequence of a toxin gene from B.t. strain PS185AA2.
SEQ ID NO. 36 is an amino acid sequence from the 185AA2 toxin.
SEQ ID NO. 37 is a nucleotide sequence of a toxin gene from B.t. strain PS196F3.
SEQ ID NO. 38 is an amino acid sequence from the 196F3 toxin.
SEQ ID NO. 39 is a nucleotide sequence of a toxin gene from B.t. strain PS196J4.
SEQ ID NO. 40 is an amino acid sequence from the 196J4 toxin.
SEQ ID NO. 41 is a nucleotide sequence of a toxin gene from B.t. strain PS197T1.
SEQ ID NO. 42 is an amino acid sequence from the 197T1 toxin.
SEQ ID NO. 43 is a nucleotide sequence of a toxin gene from B.t. strain PS197U2.
SEQ ID NO. 44 is an amino acid sequence from the 197U2 toxin.
SEQ ID NO. 45 is a nucleotide sequence of a toxin gene from B.t. strain PS202E1.
SEQ ID NO. 46 is an amino acid sequence from the 202E1 toxin.
SEQ ID NO. 47 is a nucleotide sequence of a toxin gene from B.t. strain KB33.
SEQ ID NO. 48 is a nucleotide sequence of a toxin gene from B.t. strain KB38.
SEQ ID NO. 49 is a forward primer, designated "ICON-forward," used according to the subject invention.
SEQ ID NO. 50 is a reverse primer, designated "ICON-reverse," used according to the subject invention.
SEQ ID NO. 51 is a nucleotide sequence encoding a 177C8-WAR toxin gene from B.t. strain PS177C8.
SEQ ID NO. 52 is an amino acid sequence of a 177C8-WAR toxin from B.t. strain PS177C8.
SEQ ID NO. 53 is a forward primer, designated "SUP-1A," used according to the subject invention.
SEQ ID NO. 54 is a reverse primer, designated "SUP-1B," used according to the subject invention.
SEQ ID NOS. 55-110 are primers used according to the subject invention.
SEQ ID NO. 111 is the reverse complement of the primer of SEQ ID NO. 58.
SEQ ID NO. 112 is the reverse complement of the primer of SEQ ID NO. 60.
SEQ ID NO. 113 is the reverse complement of the primer of SEQ ID NO. 64.
SEQ ID NO. 114 is the reverse complement of the primer of SEQ ID NO. 66.
SEQ ID NO. 115 is the reverse complement of the primer of SEQ ID NO. 68.
SEQ ID NO. 116 is the reverse complement of the primer of SEQ ID NO. 70.
SEQ ID NO. 117 is the reverse complement of the primer of SEQ ID NO. 72.
SEQ ID NO. 118 is the reverse complement of the primer of SEQ ID NO. 76.
SEQ ID NO. 119 is the reverse complement of the primer of SEQ ID NO. 78.
SEQ ID NO. 120 is the reverse complement of the primer of SEQ ID NO. 80.
SEQ ID NO. 121 is the reverse complement of the primer of SEQ ID NO. 82.
SEQ ID NO. 122 is the reverse complement of the primer of SEQ ID NO. 84.
SEQ ID NO. 123 is the reverse complement of the primer of SEQ ID NO. 86.
SEQ ID NO. 124 is the reverse complement of the primer of SEQ ID NO. 88.
SEQ ID NO. 125 is the reverse complement of the primer of SEQ ID NO. 92.
SEQ ID NO. 126 is the reverse complement of the primer of SEQ ID NO. 94.
SEQ ID NO. 127 is the reverse complement of the primer of SEQ ID NO. 96.
SEQ ID NO. 128 is the reverse complement of the primer of SEQ ID NO. 98.
SEQ ID NO. 129 is the reverse complement of the primer of SEQ ID NO. 99.
SEQ ID NO. 130 is the reverse complement of the primer of SEQ ID NO. 100.
SEQ ID NO. 131 is the reverse complement of the primer of SEQ ID NO. 104.
SEQ ID NO. 132 is the reverse complement of the primer of SEQ ID NO. 106.
SEQ ID NO. 133 is the reverse complement of the primer of SEQ ID NO. 108.
SEQ ID NO. 134 is the reverse complement of the primer of SEQ ID NO. 110.
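
The relationship between a primer and its reverse complement, as recited for SEQ ID NOS. 111-134, can be sketched as follows. This is a minimal illustration only: the function name `reverse_complement` and the example primer are hypothetical and are not taken from the sequence listing, and the sketch assumes sequences written 5' to 3' using only the four unambiguous bases A, C, G, and T.

```python
# Watson-Crick pairing for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        # Assumption of this sketch: degenerate (IUPAC) bases are not handled.
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement every base, then reverse so the result also reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical primer used only for illustration.
example_primer = "ATGGCCTTAG"
example_revcomp = reverse_complement(example_primer)
```

Applying the function twice returns the original sequence, which is a convenient sanity check when pairing a primer with its listed reverse complement.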
Detailed Disclosure of the Invention

The subject invention concerns materials and methods for the control of non-mammalian
pests. In specific embodiments, the subject invention pertains to new Bacillus thuringiensis
isolates and toxins which have activity against lepidopterans and/or coleopterans. The subject
invention further concerns novel genes which encode pesticidal toxins and novel methods for
identifying and characterizing Bacillus genes which encode toxins with useful properties. The
subject invention concerns not only the polynucleotide sequences which encode these toxins, but
also the use of these polynucleotide sequences to produce recombinant hosts which express the
toxins. The proteins of the subject invention are distinct from protein toxins which have
previously been isolated from Bacillus thuringiensis.

B.t. isolates useful according to the subject invention have been deposited in the
permanent collection of the Agricultural Research Service Patent Culture Collection (NRRL),
Northern Regional Research Center, 1815 North University Street, Peoria, Illinois 61604, USA.
The culture repository numbers of the B.t. strains are as follows:



Culture                 Repository No.   Deposit Date          Patent No.
B.t. PS11B (MT274)      NRRL B-21556     April 18, 1996
B.t. PS24J              NRRL B-18881     August 30, 1991
B.t. PS31G1 (MT278)     NRRL B-21560     April 18, 1996
B.t. PS36A              NRRL B-18929     December 27, 1991
B.t. PS33F2             NRRL B-18244     July 28, 1987         4,861,595
B.t. PS40D1             NRRL B-18300     February 3, 1988      5,098,705
B.t. PS43F              NRRL B-18298     February 2, 1988      4,996,155
B.t. PS45B1             NRRL B-18396     August 16, 1988       5,427,786
B.t. PS49C              NRRL B-21532     March 14, 1996
B.t. PS52A1             NRRL B-18245     July 28, 1987         4,861,595
B.t. PS62B1             NRRL B-18398     August 16, 1988       4,849,217
B.t. PS81A2             NRRL B-18484     April 19, 1989        5,164,180
B.t. PS81F              NRRL B-18424     October 7, 1988       5,045,469
B.t. PS81GG             NRRL B-18425     October 11, 1988      5,169,629
B.t. PS81I              NRRL B-18484     April 19, 1989        5,126,133
B.t. PS85A1             NRRL B-18426     October 11, 1988
B.t. PS86A1             NRRL B-18400     August 16, 1988       4,849,217
B.t. PS86B1             NRRL B-18299     February 2, 1988      4,966,765
B.t. PS86BB1 (MT275)    NRRL B-21557     April 18, 1996
B.t. PS86Q3             NRRL B-18765     February 6, 1991      5,208,017
B.t. PS86V1 (MT276)     NRRL B-21558     April 18, 1996
B.t. PS86W1 (MT277)     NRRL B-21559     April 18, 1996
B.t. PS89J3 (MT279)     NRRL B-21561     April 18, 1996
B.t. PS91C2             NRRL B-18931     February 6, 1991
B.t. PS92B              NRRL B-18889     September 23, 1991    5,427,786
B.t. PS101Z2            NRRL B-18890     October 1, 1991       5,427,786
B.t. PS122D3            NRRL B-18376     June 9, 1988          5,006,336
B.t. PS123D1            NRRL B-21011     October 13, 1992      5,508,032
B.t. PS157C1 (MT104)    NRRL B-18240     July 17, 1987         5,262,159
B.t. PS158C2            NRRL B-18872     August 27, 1991       5,268,172
B.t. PS169E             NRRL B-18682     July 17, 1990         5,151,363
B.t. PS177F1            NRRL B-18683     July 17, 1990         5,151,363
B.t. PS177G             NRRL B-18684     July 17, 1990         5,151,363
B.t. PS185L2            NRRL B-21535     March 14, 1996
B.t. PS185U2            NRRL B-21562     April 18, 1996
B.t. PS192M4            NRRL B-18932     December 27, 1991     5,273,746
B.t. PS201L1            NRRL B-18749     January 9, 1991       5,298,245
B.t. PS204C3            NRRL B-21008     October 6, 1992
B.t. PS204G4            NRRL B-18685     July 17, 1990         5,262,399
B.t. PS242H10           NRRL B-21439     March 14, 1996
B.t. PS242K17           NRRL B-21540     March 14, 1996
B.t. PS244A2            NRRL B-21541     March 14, 1996
B.t. PS244D1            NRRL B-21542     March 14, 1996
B.t. PS10E1             NRRL B-21862     October 24, 1997
B.t. PS31F2             NRRL B-21876     October 24, 1997
B.t. PS31J2             NRRL B-21009     October 13, 1992
B.t. PS33D2             NRRL B-21870     October 24, 1997
B.t. PS66D3             NRRL B-21858     October 24, 1997
B.t. PS68F              NRRL B-21857     October 24, 1997
B.t. PS69AA2            NRRL B-21859     October 24, 1997
B.t. PS146D             NRRL B-21866     October 24, 1997
B.t. PS168G1            NRRL B-21873     October 24, 1997
B.t. PS175I4            NRRL B-21865     October 24, 1997
acknowledges the duty to replace the deposit(s) should the depository be unable to furnish a
sample when requested, due to the condition of a deposit. All restrictions on the availability to
the public of the subject culture deposits will be irrevocably removed upon the granting of a
patent disclosing them.

Many of the strains useful according to the subject invention are readily available by
virtue of the issuance of patents disclosing these strains or by their deposit in public collections 
or by their inclusion in commercial products. For example, the B.t. strain used in the 
commercial product, Javelin, and the HD isolates are all publicly available. 

Mutants of the isolates referred to herein can be made by procedures well known in the 
art. For example, an asporogenous mutant can be obtained through ethylmethane sulfonate 
(EMS) mutagenesis of an isolate. The mutants can be made using ultraviolet light and 
nitrosoguanidine by procedures well known in the art. 

In one embodiment, the subject invention concerns materials and methods including 
nucleotide primers and probes for isolating, characterizing, and identifying Bacillus genes 
encoding protein toxins which are active against non-mammalian pests. The nucleotide 
sequences described herein can also be used to identify new pesticidal Bacillus isolates. The 
invention further concerns the genes, isolates, and toxins identified using the methods and
materials disclosed herein.

The new toxins and polynucleotide sequences provided here are defined according to 
several parameters. One characteristic of the toxins described herein is pesticidal activity. In 
a specific embodiment, these toxins have activity against coleopteran and/or lepidopteran pests. 
The toxins and genes of the subject invention can be further defined by their amino acid and 
nucleotide sequences. The sequences of the molecules can be defined in terms of homology to 
certain exemplified sequences as well as in terms of the ability to hybridize with, or be amplified 
by, certain exemplified probes and primers. The toxins provided herein can also be identified 
based on their immunoreactivity with certain antibodies. 

An important aspect of the subject invention is the identification and characterization 
of new families of Bacillus toxins, and genes which encode these toxins. These families have 
been designated MIS-1, MIS-2, MIS-3, MIS-4, MIS-5, MIS-6, WAR-1, and SUP-1. Toxins
within these families, as well as genes encoding toxins within these families, can readily be 
identified as described herein by, for example, size, amino acid or DNA sequence, and antibody 
reactivity. Amino acid and DNA sequence characteristics include homology with exemplified 
sequences, ability to hybridize with DNA probes, and ability to be amplified with specific 
primers. 
The MIS-1 family of toxins includes toxins from isolate PS68F. Also provided are
hybridization probes and PCR primers which specifically identify genes falling in the MIS-1
family.

A second family of toxins identified herein is the MIS-2 family. This family includes
toxins which can be obtained from isolates PS66D3, PS197T1, and PS31J2. The subject
invention further provides probes and primers for the identification of MIS-2 toxins and genes.

A third family of toxins identified herein is the MIS-3 family. This family includes
toxins which can be obtained from B.t. isolates PS69AA2 and PS33D2. The subject invention
further provides probes and primers for identification of the MIS-3 genes and toxins.

Polynucleotide sequences encoding MIS-4 toxins can be obtained from the B.t. isolate
designated PS197U2. The subject invention further provides probes and primers for the
identification of genes and toxins in this family.

A fifth family of toxins identified herein is the MIS-5 family. This family includes
toxins which can be obtained from B.t. isolates KB33 and KB38. The subject invention further
provides probes and primers for identification of the MIS-5 genes and toxins.

A sixth family of toxins identified herein is the MIS-6 family. This family includes
toxins which can be obtained from B.t. isolates PS196F3, PS168G1, PS196J4, PS202E1,
PS10E1, and PS185AA2. The subject invention further provides probes and primers for
identification of the MIS-6 genes and toxins.

In a preferred embodiment, the genes of the MIS family encode toxins having a
molecular weight of about 70 to about 100 kDa and, most preferably, the toxins have a size of
about 80 kDa. Typically, these toxins are soluble and can be obtained from the supernatant of
Bacillus cultures as described herein. These toxins have toxicity against non-mammalian pests.
In a preferred embodiment, these toxins have activity against coleopteran pests. The MIS
proteins are further useful due to their ability to form pores in cells. These proteins can be used
with second entities including, for example, other proteins. When used with a second entity, the
MIS protein will facilitate entry of the second agent into a target cell. In a preferred
embodiment, the MIS protein interacts with MIS receptors in a target cell and causes pore
formation in the target cell. The second entity may be a toxin or another molecule whose entry
into the cell is desired.

The subject invention further concerns a family of toxins designated WAR-1. The
WAR-1 toxins typically have a size of about 30-50 kDa and, most typically, have a size of about
40 kDa. Typically, these toxins are soluble and can be obtained from the supernatant of Bacillus
cultures as described herein. The WAR-1 toxins can be identified with primers described herein
as well as with antibodies. In a specific embodiment, the antibodies can be raised to, for
example, toxin from isolate PS177C8.

An additional family of toxins provided according to the subject invention are the toxins
designated SUP-1. Typically, these toxins are soluble and can be obtained from the supernatant
of Bacillus cultures as described herein. In a preferred embodiment, the SUP-1 toxins are active
against lepidopteran pests. The SUP-1 toxins typically have a size of about 70-100 kDa and,
preferably, about 80 kDa. The SUP-1 family is exemplified herein by toxins from isolates
PS49C and PS158C2. The subject invention provides probes and primers useful for the
identification of toxins and genes in the SUP-1 family.

The subject invention further provides specific Bacillus toxins and genes which did not
fall into any of the new families disclosed herein. These specific toxins and genes include toxins
and genes which can be obtained from PS177C8 and PS177I8.

Toxins in the MIS, WAR, and SUP families are all soluble and can be obtained as
described herein from the supernatant of Bacillus cultures. These toxins can be used alone or
in combination with other toxins to control pests. For example, toxins from the MIS families
may be used in conjunction with WAR-type toxins to achieve control of pests, particularly
coleopteran pests. These toxins may be used, for example, with δ-endotoxins which are
obtained from Bacillus isolates.

Table 1 provides a summary of the novel families of toxins and genes of the subject
invention. Each of the six MIS families is specifically exemplified herein by toxins which can
be obtained from particular B.t. isolates as shown in Table 1. Genes encoding toxins in each of
these families can be identified by a variety of highly specific parameters, including the ability
to hybridize with the particular probes set forth in Table 1. Sequence identity in excess of about
80% with the probes set forth in Table 1 can also be used to identify the genes of the various
families. Also exemplified are particular primer pairs which can be used to amplify the genes
of the subject invention. A portion of a gene within the indicated families would typically be
amplifiable with at least one of the enumerated primer pairs. In a preferred embodiment, the
amplified portion would be of approximately the indicated fragment size. Primers shown in
Table 1 consist of polynucleotide sequences which encode peptides as shown in the sequence
listing attached hereto. Additional primers and probes can readily be constructed by those
skilled in the art such that alternate polynucleotide sequences encoding the same amino acid
sequences can be used to identify and/or characterize additional genes encoding pesticidal
toxins. In a preferred embodiment, these additional toxins, and their genes, could be obtained
from Bacillus isolates.
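
By way of illustration only, the expected fragment size for a primer pair can be estimated in silico when the template sequence is known. The sketch below is not the method of the subject invention; it assumes exact primer matches on a coding-strand template, and the names `amplicon_length`, `template`, `fwd`, and `rev` are hypothetical, as are the toy sequences. Actual amplification tolerates some mismatch, which this sketch does not model.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    """Reverse complement of a 5'->3' DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length in nt of the fragment primed by `forward` and `reverse`.

    `template` is the coding strand, 5'->3'. `reverse` is written 5'->3'
    as synthesized, so its binding site appears on the coding strand as
    its reverse complement. Returns None if either primer has no exact
    site, or if the reverse site does not lie downstream of the forward.
    """
    template, forward = template.upper(), forward.upper()
    start = template.find(forward)
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    # Fragment spans from the first base of the forward primer
    # to the last base of the reverse primer's site.
    return end + len(site) - start


# Toy template (hypothetical): 2 nt flank, 8 nt forward site,
# 10 nt spacer, 8 nt reverse site, 2 nt flank.
template = "GG" + "ATGCATGC" + "TTTTTTTTTT" + "CCGGAATT" + "AA"
fwd = "ATGCATGC"
rev = reverse_complement("CCGGAATT")
size = amplicon_length(template, fwd, rev)
```

A computed size can then be compared with the approximate fragment size indicated for the corresponding primer pair.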
Table 1.

Family   Isolates                   Probes          Primer Pairs     Fragment size
                                    (SEQ ID NO.)    (SEQ ID NOS.)    (nt)
MIS-1    PS68F                      26              56 and 111       69
                                                    56 and 112       506
                                                    58 and 112       458
MIS-2    PS66D3, PS197T1, PS31J2    24, 41, 20      62 and 113       160
                                                    62 and 114       239
                                                    62 and 115       400
                                                    62 and 116       509
                                                    62 and 117       703
                                                    64 and 114       102
                                                    64 and 115       263
                                                    64 and 116       372
                                                    64 and 117       566
                                                    66 and 115       191
                                                    66 and 116       300
                                                    66 and 117       494
                                                    68 and 116       131
                                                    68 and 117       325
                                                    70 and 117       213
MIS-3    PS69AA2, PS33D2            28, 22          74 and 118       141
                                                    74 and 119       376
                                                    74 and 120       389
                                                    74 and 121       483
                                                    74 and 122       715
                                                    74 and 123       743
                                                    74 and 124       902
                                                    76 and 119       253
                                                    76 and 120       266
                                                    76 and 121       360
                                                    76 and 122       592
                                                    76 and 123       620
                                                    76 and 124       779
                                                    78 and 120       31
                                                    78 and 121       125
                                                    78 and 122       357
                                                    78 and 123       385
                                                    78 and 124       544
                                                    80 and 121       116
                                                    80 and 122       348
[Table 1, continued. The column layout of the remainder of the table is not recoverable; the legible entries are listed below.]
Family: MIS-5. Isolates: KB33, KB38. Probes (SEQ ID NOS.): 47, 48.
Primer pairs (SEQ ID NOS.): 97 and 128; 97 and 129; 97 and 130; 98 and 129; 98 and 130; 99 and 130; 102 and 131; 102 and 132; 102 and 133; 102 and 134; 104 and 132; 104 and 133; 104 and 134; 106 and 133; 106 and 134; 108 and 134; 53 and 54.
Fragment sizes (nt), not matched to primer pairs: 379, 504, 754, 213, 199, 708, 31.
[...] H.R. Whiteley (1990) [...] 265:20923-20930; [...] application.

With the teachings provided herein, one skilled in the art [...] the various toxins and
polynucleotide sequences described herein.

Genes and toxins. The genes and toxins useful according to the subject invention include
[...] but also [...] of these sequences [...] proteins which retain the characteristic pesticidal
activity of the toxins [...] from more than one [...] of genes refer to nucleotide sequences [...]
refers to toxins having the same [...] the isolates deposited [...] depository [...] constructed
[...] available exonucleases or endonucleases according to standard procedures. [...] enzymes
such as Bal31 or site-directed mutagenesis can be used to systematically cut off [...] fragments
may be obtained using a variety of restriction enzymes. Proteases may [...] Bacillus isolates
and/or DNA libraries using the teachings provided herein. There are a number
of methods for obtaining the pesticidal toxins of the instant invention. For example, antibodies
to the pesticidal toxins disclosed and claimed herein can be used to identify and isolate toxins
from a mixture of proteins. Specifically, antibodies may be raised to the portions of the toxins
which are most constant and most distinct from other Bacillus toxins. These antibodies can then
be used to specifically identify equivalent toxins with the characteristic activity by
immunoprecipitation, enzyme linked immunosorbent assay (ELISA), or Western blotting.
Antibodies to the toxins disclosed herein, or to equivalent toxins, or fragments of these toxins,
can readily be prepared using standard procedures in this art. The genes which encode these
toxins can then be obtained from the microorganism.

Fragments and equivalents which retain the pesticidal activity of the exemplified toxins
are within the scope of the subject invention. Also, because of the redundancy of the genetic
code, a variety of different DNA sequences can encode the amino acid sequences disclosed
herein. It is well within the skill of a person trained in the art to create these alternative DNA
sequences encoding the same, or essentially the same, toxins. These variant DNA sequences are
within the scope of the subject invention. As used herein, reference to "essentially the same"
sequence refers to sequences which have amino acid substitutions, deletions, additions, or
insertions which do not materially affect pesticidal activity. Fragments retaining pesticidal
activity are also included in this definition.
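
The redundancy of the genetic code can be made concrete by counting how many distinct DNA sequences encode a given peptide. The following is a minimal sketch assuming the standard genetic code; the function name `count_encodings` and the example peptides are hypothetical and are not taken from the sequence listing.

```python
from math import prod

# Number of sense codons per amino acid in the standard genetic code
# (one-letter residue codes). The 20 values sum to 61 sense codons.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode `peptide`.

    The count is the product of the per-residue codon counts; stop
    codons are not included. Raises KeyError for non-standard residues.
    """
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Hypothetical tripeptide Met-Leu-Ser: 1 * 6 * 6 alternative encodings.
n_alternatives = count_encodings("MLS")
```

Because the count grows multiplicatively with length, even a short peptide admits a very large number of alternative coding sequences.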

A further method for identifying the toxins and genes of the subject invention is through
the use of oligonucleotide probes. These probes are detectable nucleotide sequences. Probes
provide a rapid method for identifying toxin-encoding genes of the subject invention. The
nucleotide segments which are used as probes according to the invention can be synthesized
using a DNA synthesizer and standard procedures.
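
As a rough computational analogue of probing, a probe sequence can be slid along a target sequence and scored for identity at each position. The sketch below is an assumption-laden illustration: plain string comparison is only a crude stand-in for hybridization, gaps are not considered, only one strand is scanned, and the name `best_ungapped_match` and the toy sequences are hypothetical.

```python
from typing import Tuple


def best_ungapped_match(probe: str, target: str) -> Tuple[int, float]:
    """Return (offset, percent identity) of the best ungapped placement.

    The probe is compared base by base against every window of the
    target having the probe's length; the first best window wins ties.
    """
    probe, target = probe.upper(), target.upper()
    if not probe or len(probe) > len(target):
        raise ValueError("probe must be non-empty and no longer than target")
    best_offset, best_matches = 0, -1
    for offset in range(len(target) - len(probe) + 1):
        window = target[offset:offset + len(probe)]
        matches = sum(p == t for p, t in zip(probe, window))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, 100.0 * best_matches / len(probe)


# Toy example: the target carries one mismatch within the probe site.
offset, identity = best_ungapped_match("ACGTAC", "TTACGAACTT")
```

The identity value returned could be compared with a chosen cutoff, for example the figure of about 80% sequence identity with a probe mentioned elsewhere in this disclosure.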

Certain toxins of the subject invention have been specifically exemplified herein. Since
these toxins are merely exemplary of the toxins of the subject invention, it should be readily
apparent that the subject invention comprises variant or equivalent toxins (and nucleotide
sequences coding for equivalent toxins) having the same or similar pesticidal activity of the
exemplified toxin. Equivalent toxins will have amino acid homology with an exemplified toxin.
This amino acid identity will typically be greater than 60%, preferably be greater than 75%,
more preferably greater than 80%, more preferably greater than 90%, and can be greater than
95%. These identities are as determined using standard alignment techniques. The amino acid
homology will be highest in critical regions of the toxin which account for biological activity
or are involved in the determination of three-dimensional configuration which ultimately is
responsible for the biological activity. In this regard, certain amino acid substitutions are
acceptable and can be expected if these substitutions are in regions which are not critical to
activity or are conservative amino acid substitutions which do not affect the three-dimensional
configuration of the molecule. For example, amino acids may be placed in the following
classes: non-polar, uncharged polar, basic, and acidic. Conservative substitutions whereby an
amino acid of one class is replaced with another amino acid of the same type fall within the
scope of the subject invention so long as the substitution does not materially alter the biological
activity of the compound. Table 2 provides a listing of examples of amino acids belonging to
each class.
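
The identity thresholds recited above can be applied to a pair of already-aligned sequences as sketched below. This is an illustration under stated assumptions, not a prescribed procedure: the alignment itself is assumed to have been produced by a standard tool, identity is computed over all alignment columns (other denominators are also in common use), and the names `percent_identity`, `highest_threshold_met`, and the example sequences are hypothetical.

```python
from typing import Optional

# Thresholds recited in the text, highest first.
THRESHOLDS = (95, 90, 80, 75, 60)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be the same length, with `gap` marking gaps.
    Columns that are gaps in both sequences are ignored; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Largest recited threshold that `identity` strictly exceeds."""
    for threshold in THRESHOLDS:
        if identity > threshold:
            return threshold
    return None


# Hypothetical aligned fragments: 6 identical columns out of 8.
identity = percent_identity("MKTA-IAK", "MKTAYIAR")
tier = highest_threshold_met(identity)
```

Note that the comparison is strict ("greater than"), so an identity of exactly 75% meets only the 60% threshold.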



Table 2.

Class of Amino Acid    Examples of Amino Acids
Nonpolar               Ala, Val, Leu, Ile, Pro, Met, Phe, Trp
Uncharged Polar        Gly, Ser, Thr, Cys, Tyr, Asn, Gln
Acidic                 Asp, Glu
Basic                  Lys, Arg, His
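
The classification of Table 2 lends itself to a simple check of whether a proposed substitution is conservative in the sense used herein, that is, whether both residues belong to the same class. The sketch below uses the standard three-letter residue codes; the names `AMINO_ACID_CLASSES`, `class_of`, and `is_conservative` are hypothetical. Class membership alone does not establish that biological activity is retained, which must still be determined by assay.

```python
# Classes and members as listed in Table 2 (three-letter codes).
AMINO_ACID_CLASSES = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Pro", "Met", "Phe", "Trp"},
    "uncharged polar": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
}


def class_of(residue: str) -> str:
    """Return the Table 2 class name for a three-letter residue code."""
    code = residue.strip().capitalize()
    for name, members in AMINO_ACID_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall within the same Table 2 class."""
    return class_of(original) == class_of(replacement)


# Leu -> Ile stays within the nonpolar class; Asp -> Lys does not
# stay within a class (acidic to basic).
leu_to_ile = is_conservative("Leu", "Ile")
asp_to_lys = is_conservative("Asp", "Lys")
```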



In some instances, non-conservative substitutions can also be made. The critical factor 
is that these substitutions must not significantly detract from the biological activity of the toxin. 

The δ-endotoxins of the subject invention can also be characterized in terms of the shape
and location of toxin inclusions, which are described above. 

As used herein, reference to "isolated" polynucleotides and/or "purified" toxins refers 
to these molecules when they are not associated with the other molecules with which they would 
be found in nature. Thus, reference to "isolated and purified" signifies the involvement of the 
"hand of man" as described herein. Chimeric toxins and genes also involve the "hand of man." 

Recombinant hosts. The toxin-encoding genes of the subject invention can be
introduced into a wide variety of microbial or plant hosts. Expression of the toxin gene results, 
directly or indirectly, in the production and maintenance of the pesticide. With suitable 
microbial hosts, e.g., Pseudomonas, the microbes can be applied to the situs of the pest, where 
they will proliferate and be ingested. The result is a control of the pest. Alternatively, the 
microbe hosting the toxin gene can be killed and treated under conditions that prolong the 
activity of the toxin and stabilize the cell. The treated cell, which retains the toxic activity, then 
can be applied to the environment of the target pest. 
Where the Bacillus toxin gene is introduced via a suitable vector into a microbial host,
and said host is applied to the environment in a living state, it is essential that certain host
microbes be used. Microorganism hosts are selected which are known to occupy the
"phytosphere" (phylloplane, phyllosphere, rhizosphere, and/or rhizoplane) of one or more crops
of interest. These microorganisms are selected so as to be capable of successfully competing
in the particular environment (crop and other insect habitats) with the wild-type microorganisms,
provide for stable maintenance and expression of the gene expressing the polypeptide pesticide,
and, desirably, provide for improved protection of the pesticide from environmental degradation
and inactivation.

A large number of microorganisms are known to inhabit the phylloplane (the surface of
the plant leaves) and/or the rhizosphere (the soil surrounding plant roots) of a wide variety of
important crops. These microorganisms include bacteria, algae, and fungi. Of particular interest
are microorganisms, such as bacteria, e.g., genera Pseudomonas, Erwinia, Serratia, Klebsiella,
Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas, Methylophilus, Agrobacterium,
Acetobacter, Lactobacillus, Arthrobacter, Azotobacter, Leuconostoc, and Alcaligenes; fungi,
particularly yeast, e.g., genera Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces,
Rhodotorula, and Aureobasidium. Of particular interest are such phytosphere bacterial species
as Pseudomonas syringae, Pseudomonas fluorescens, Serratia marcescens, Acetobacter xylinum,
Agrobacterium tumefaciens, Rhodopseudomonas spheroides, Xanthomonas campestris,
Rhizobium meliloti, Alcaligenes eutrophus, and Azotobacter vinelandii; and phytosphere yeast
species such as Rhodotorula rubra, R. glutinis, R. marina, R. aurantiaca, Cryptococcus albidus,
C. diffluens, C. laurentii, Saccharomyces rosei, S. pretoriensis, S. cerevisiae, Sporobolomyces
roseus, S. odorus, Kluyveromyces veronae, and Aureobasidium pullulans. Of particular interest
are the pigmented microorganisms.

A wide variety of ways are available for introducing a Bacillus gene encoding a toxin
into a microorganism host under conditions which allow for stable maintenance and expression
of the gene. These methods are well known to those skilled in the art and are described, for
example, in United States Patent No. 5,135,867, which is incorporated herein by reference.

Synthetic genes which are functionally equivalent to the toxins of the subject invention
can also be used to transform hosts. Methods for the production of synthetic genes can be found
in, for example, U.S. Patent No. 5,380,831.

Treatment of cells. As mentioned above, Bacillus or recombinant cells expressing a
Bacillus toxin can be treated to prolong the toxin activity and stabilize the cell. The pesticide 
microcapsule that is formed comprises the Bacillus toxin within a cellular structure that has been 
stabilized and will protect the toxin when the microcapsule is applied to the environment of the
target pest. Suitable host cells may include either prokaryotes or eukaryotes. As hosts, of
particular interest will be the prokaryotes and the lower eukaryotes, such as fungi. The cell will
usually be intact and be substantially in the proliferative form when treated, rather than in a
spore form.

Treatment of the microbial cell, e.g., a microbe containing the Bacillus toxin gene, can
be by chemical or physical means, or by a combination of chemical and/or physical means, so
long as the technique does not deleteriously affect the properties of the toxin, nor diminish the
cellular capability of protecting the toxin. Methods for treatment of microbial cells are disclosed
in United States Patent Nos. 4,695,455 and 4,695,462, which are incorporated herein by
reference.

Methods and formulations for control of pests. Control of pests using the isolates, toxins,
and genes of the subject invention can be accomplished by a variety of methods known to those
skilled in the art. These methods include, for example, the application of Bacillus isolates to the
pests (or their location), the application of recombinant microbes to the pests (or their locations),
and the transformation of plants with genes which encode the pesticidal toxins of the subject
invention. Transformations can be made by those skilled in the art using standard techniques.
Materials necessary for these transformations are disclosed herein or are otherwise readily
available to the skilled artisan.

Formulated bait granules containing an attractant and the toxins of the Bacillus isolates,
or recombinant microbes comprising the genes obtainable from the Bacillus isolates disclosed
herein, can be applied to the soil. Formulated product can also be applied as a seed-coating or
root treatment or total plant treatment at later stages of the crop cycle. Plant and soil treatments
of Bacillus cells may be employed as wettable powders, granules or dusts, by mixing with
various inert materials, such as inorganic minerals (phyllosilicates, carbonates, sulfates,
phosphates, and the like) or botanical materials (powdered corncobs, rice hulls, walnut shells,
and the like). The formulations may include spreader-sticker adjuvants, stabilizing agents, other
pesticidal additives, or surfactants. Liquid formulations may be aqueous-based or non-aqueous
and employed as foams, gels, suspensions, emulsifiable concentrates, or the like. The
ingredients may include rheological agents, surfactants, emulsifiers, dispersants, or polymers.

As would be appreciated by a person skilled in the art, the pesticidal concentration will
vary widely depending upon the nature of the particular formulation, particularly whether it is
a concentrate or to be used directly. The pesticide will be present in at least 1% by weight and
may be 100% by weight. The dry formulations will have from about 1-95% by weight of the
pesticide while the liquid formulations will generally be from about 1-60% by weight of the
solids in the liquid phase. The formulations that contain cells will generally have from about
10^2 to about 10^4 cells/mg. These formulations will be administered at about 50 mg (liquid or
dry) to 1 kg or more per hectare.

The formulations can be applied to the environment of the pest, e.g., soil and foliage,
by spraying, dusting, sprinkling, or the like.

Polynucleotide probes. It is well known that DNA possesses a fundamental property
called base complementarity. In nature, DNA ordinarily exists in the form of pairs of anti-parallel
strands, the bases on each strand projecting from that strand toward the opposite strand.
The base adenine (A) on one strand will always be opposed to the base thymine (T) on the other
strand, and the base guanine (G) will be opposed to the base cytosine (C). The bases are held
in apposition by their ability to hydrogen bond in this specific way. Though each individual
bond is relatively weak, the net effect of many adjacent hydrogen bonded bases, together with
base stacking effects, is a stable joining of the two complementary strands. These bonds can be
broken by treatments such as high pH or high temperature, and these conditions result in the
dissociation, or "denaturation," of the two strands. If the DNA is then placed in conditions
which make hydrogen bonding of the bases thermodynamically favorable, the DNA strands will
anneal, or "hybridize," and reform the original double stranded DNA. If carried out under
appropriate conditions, this hybridization can be highly specific. That is, only strands with a
high degree of base complementarity will be able to form stable double stranded structures. The
relationship of the specificity of hybridization to reaction conditions is well known. Thus,
hybridization may be used to test whether two pieces of DNA are complementary in their base
sequences. It is this hybridization mechanism which facilitates the use of probes of the subject
invention to readily detect and characterize DNA sequences of interest.
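
By way of illustration only, the base-pairing rule described above (A opposite T, G opposite C, on anti-parallel strands) can be sketched in a few lines of Python. The function names `reverse_complement` and `fraction_complementary` are hypothetical and are not part of the subject disclosure; the sketch considers only equal-length strands and ignores gaps, RNA bases, and degenerate codes.

```python
# Watson-Crick pairing as described in the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the anti-parallel complementary strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def fraction_complementary(probe: str, target: str) -> float:
    """Fraction of probe positions that pair with an equal-length target.

    Both strands are written 5'->3'. A value of 1.0 means perfect
    complementarity; lower values indicate mismatches.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    # The ideal partner of the target is its reverse complement.
    partner = reverse_complement(target)
    matches = sum(p == q for p, q in zip(probe.upper(), partner))
    return matches / len(probe)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(fraction_complementary("ATGC", "GCAA"))
```

A probe with a fraction below 1.0 may still hybridize under less stringent conditions, which is why hybridization specificity depends on the reaction conditions.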
The probes may be RNA or DNA. The probe will normally have at least about 10 bases,
more usually at least about 17 bases, and may have up to about 100 bases or more. Longer
probes can readily be utilized, and such probes can be, for example, several kilobases in length.
The probe sequence is designed to be at least substantially complementary to a portion of a gene
encoding a toxin of interest. The probe need not have perfect complementarity to the sequence
to which it hybridizes. The probes may be labelled utilizing techniques which are well known
to those skilled in this art.

One approach for the use of the subject invention as probes entails first identifying by
Southern blot analysis of a gene bank of the Bacillus isolate all DNA segments homologous with
the disclosed nucleotide sequences. Thus, it is possible, without the aid of biological analysis,
to know in advance the probable activity of many new Bacillus isolates, and of the individual
gene products expressed by a given Bacillus isolate. Such a probe analysis provides a rapid
method for identifying potentially commercially valuable insecticidal toxin genes within the
multifarious subspecies of B.t.

One hybridization procedure useful according to the subject invention typically includes
the initial steps of isolating the DNA sample of interest and purifying it chemically. Either lysed
bacteria or total fractionated nucleic acid isolated from bacteria can be used. Cells can be treated
using known techniques to liberate their DNA (and/or RNA). The DNA sample can be cut into
pieces with an appropriate restriction enzyme. The pieces can be separated by size through
electrophoresis in a gel, usually agarose or acrylamide. The pieces of interest can be transferred
to an immobilizing membrane.

The particular hybridization technique is not essential to the subject invention. As 
improvements are made in hybridization techniques, they can be readily applied. 

The probe and sample can then be combined in a hybridization buffer solution and held
at an appropriate temperature until annealing occurs. Thereafter, the membrane is washed free
of extraneous materials, leaving the sample and bound probe molecules typically detected and
quantified by autoradiography and/or liquid scintillation counting. As is well known in the art,
if the probe molecule and nucleic acid sample hybridize by forming a strong non-covalent bond
between the two molecules, it can be reasonably assumed that the probe and sample are
essentially identical. The probe's detectable label provides a means for determining in a known
manner whether hybridization has occurred.

In the use of the nucleotide segments as probes, the particular probe is labeled with any
suitable label known to those skilled in the art, including radioactive and non-radioactive labels.
Typical radioactive labels include 32P, 35S, or the like. Non-radioactive labels include, for
example, ligands such as biotin or thyroxine, as well as enzymes such as hydrolases or
peroxidases, or the various chemiluminescers such as luciferin, or fluorescent compounds like
fluorescein and its derivatives. The probes may be made inherently fluorescent as described in
International Application No. WO 93/16094.

Various degrees of stringency of hybridization can be employed. The more severe the
conditions, the greater the complementarity that is required for duplex formation. Severity can
be controlled by temperature, probe concentration, probe length, ionic strength, time, and the
like. Preferably, hybridization is conducted under moderate to high stringency conditions by
techniques well known in the art, as described, for example, in Keller, G.H., M.M. Manak (1987)
DNA Probes, Stockton Press, New York, NY., pp. 169-170.

As used herein "moderate to high stringency" conditions for hybridization refers to
conditions which achieve the same, or about the same, degree of specificity of hybridization as
the conditions employed by the current applicants. Examples of moderate and high stringency
conditions are provided herein. Specifically, hybridization of immobilized DNA on Southern
blots with 32P-labeled gene-specific probes was performed by standard methods (Maniatis et
al.). In general, hybridization and subsequent washes were carried out under moderate to high
stringency conditions that allowed for detection of target sequences with homology to the
exemplified toxin genes. For double-stranded DNA gene probes, hybridization was carried out
overnight at 20-25°C below the melting temperature (Tm) of the DNA hybrid in 6X SSPE, 5X
Denhardt's solution, 0.1% SDS, 0.1 mg/ml denatured DNA. The melting temperature is

described by the following formula (Beltz, G.A., K.A. Jacobs, T.H. Eickbush, P.T. Cherbas, and
F.C. Kafatos [1983] Methods of Enzymology, R. Wu, L. Grossman and K. Moldave [eds.]
Academic Press, New York 100:266-285).

Tm = 81.5°C + 16.6 Log[Na+] + 0.41(%G+C) - 0.61(%formamide) - 600/length of duplex in
base pairs.
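
By way of illustration only, the melting temperature formula above can be evaluated as in the following Python sketch. The function names are hypothetical, "Log" is assumed to be the base-10 logarithm and [Na+] is assumed to be expressed in mol/l, and the input values in the example are chosen arbitrarily; they are not conditions taken from the subject disclosure.

```python
import math


def tm_long_probe(na_molar: float, percent_gc: float,
                  percent_formamide: float, length_bp: int) -> float:
    """Melting temperature (deg C) of a DNA duplex per the formula quoted above.

    Assumes Log is log10 and [Na+] is given in mol/l.
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 600.0 / length_bp)


def hybridization_window(tm_c: float) -> tuple:
    """Temperature range 20-25 deg C below Tm, as used for double-stranded gene probes."""
    return (tm_c - 25.0, tm_c - 20.0)


if __name__ == "__main__":
    # Hypothetical inputs: 1.0 M Na+, 40% G+C, no formamide, 1000 bp duplex.
    tm = tm_long_probe(1.0, 40.0, 0.0, 1000)
    low, high = hybridization_window(tm)
    print(f"Tm = {tm:.1f} C; hybridize between {low:.1f} and {high:.1f} C")
```

The same Tm value also fixes the moderate stringency wash temperature (Tm-20°C) recited for double-stranded probes.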

Washes are typically carried out as follows:

(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low
stringency wash).

(2) Once at Tm-20°C for 15 minutes in 0.2X SSPE, 0.1% SDS (moderate
stringency wash).

For oligonucleotide probes, hybridization was carried out overnight at 10-20°C below
the melting temperature (Tm) of the hybrid in 6X SSPE, 5X Denhardt's solution, 0.1% SDS, 0.1
mg/ml denatured DNA. Tm for oligonucleotide probes was determined by the following
formula:

Tm (°C) = 2(number T/A base pairs) + 4(number G/C base pairs) (Suggs, S.V., T.
Miyake, E.H. Kawashime, M.J. Johnson, K. Itakura, and R.B. Wallace [1981] ICN-UCLA Symp.
Dev. Biol. Using Purified Genes, D.D. Brown [ed.], Academic Press, New York, 23:683-693).

Washes were typically carried out as follows:

(1) Twice at room temperature for 15 minutes in 1X SSPE, 0.1% SDS (low stringency
wash).

(2) Once at the hybridization temperature for 15 minutes in 1X SSPE, 0.1% SDS
(moderate stringency wash).
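
By way of illustration only, the oligonucleotide formula above (2°C per T/A base pair and 4°C per G/C base pair) can be applied to a non-degenerate primer as in the following Python sketch. The function name `tm_oligo` is hypothetical. The example sequences are 49C PRIMER B (SEQ ID NO. 13) and 49C PRIMER C (SEQ ID NO. 14) as printed in this disclosure; primers containing degenerate codes are rejected, since their base composition is not fixed.

```python
def tm_oligo(seq: str) -> int:
    """Tm (deg C) = 2 * (number of A/T bases) + 4 * (number of G/C bases)."""
    s = seq.upper().replace(" ", "")
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def oligo_hybridization_window(tm_c: int) -> tuple:
    """Temperature range 10-20 deg C below Tm, as used for oligonucleotide probes."""
    return (tm_c - 20, tm_c - 10)


if __name__ == "__main__":
    primer_b = "AAATTATGCGCTAAGTCTGC"  # SEQ ID NO. 13
    primer_c = "TTGATCCGGACATAATAAT"   # SEQ ID NO. 14
    for name, seq in (("49C PRIMER B", primer_b), ("49C PRIMER C", primer_c)):
        tm = tm_oligo(seq)
        print(name, tm, oligo_hybridization_window(tm))
```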

In general, salt and/or temperature can be altered to change stringency. With a labeled
DNA fragment >70 or so bases in length, the following conditions can be used:

Low:        1 or 2X SSPE, room temperature
Low:        1 or 2X SSPE, 42°C
Moderate:   0.2X or 1X SSPE, 65°C
High:       0.1X SSPE, 65°C.
Duplex formation and stability depend on substantial complementarity between the two
strands of a hybrid, and, as noted above, a certain degree of mismatch can be tolerated.
Therefore, the probe sequences of the subject invention include mutations (both single and
multiple), deletions, insertions of the described sequences, and combinations thereof, wherein
said mutations, insertions and deletions permit formation of stable hybrids with the target
polynucleotide of interest. Mutations, insertions, and deletions can be produced in a given
polynucleotide sequence in many ways, and these methods are known to an ordinarily skilled
artisan. Other methods may become known in the future.

Thus, mutational, insertional, and deletional variants of the disclosed nucleotide
sequences can be readily prepared by methods which are well known to those skilled in the art.
These variants can be used in the same manner as the exemplified primer sequences so long as
the variants have substantial sequence homology with the original sequence. As used herein,
substantial sequence homology refers to homology which is sufficient to enable the variant
probe to function in the same capacity as the original probe. Preferably, this homology is greater
than 50%; more preferably, this homology is greater than 75%; and most preferably, this
homology is greater than 90%. The degree of homology needed for the variant to function in
its intended capacity will depend upon the intended use of the sequence. It is well within the
skill of a person trained in this art to make mutational, insertional, and deletional mutations
which are designed to improve the function of the sequence or otherwise provide a
methodological advantage.

PCR technology. Polymerase Chain Reaction (PCR) is a repetitive, enzymatic, primed
synthesis of a nucleic acid sequence. This procedure is well known and commonly used by
those skilled in this art (see Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,800,159; Saiki,
Randall K., Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich,
Norman Arnheim [1985] "Enzymatic Amplification of β-Globin Genomic Sequences and
Restriction Site Analysis for Diagnosis of Sickle Cell Anemia," Science 230:1350-1354). PCR
is based on the enzymatic amplification of a DNA fragment of interest that is flanked by two
oligonucleotide primers that hybridize to opposite strands of the target sequence. The primers
are oriented with the 3' ends pointing towards each other. Repeated cycles of heat denaturation
of the template, annealing of the primers to their complementary sequences, and extension of
the annealed primers with a DNA polymerase result in the amplification of the segment defined
by the 5' ends of the PCR primers. Since the extension product of each primer can serve as a
template for the other primer, each cycle essentially doubles the amount of DNA fragment
produced in the previous cycle. This results in the exponential accumulation of the specific
target fragment, up to several million-fold in a few hours. By using a thermostable DNA
polymerase such as Taq polymerase, which is isolated from the thermophilic bacterium Thermus
aquaticus, the amplification process can be completely automated. Other enzymes which can
be used are known to those skilled in the art.
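
By way of illustration only, the doubling described above can be expressed numerically as in the following Python sketch. The function names and the `efficiency` parameter are hypothetical; an efficiency of 1.0 corresponds to the idealized case in which every cycle exactly doubles the fragment, which real reactions only approximate.

```python
import math


def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in target fragment after the given number of cycles.

    efficiency = 1.0 models perfect doubling per cycle (idealized assumption).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return (1.0 + efficiency) ** cycles


def cycles_needed(fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles giving at least the requested fold increase."""
    if fold < 1.0 or not 0.0 < efficiency <= 1.0:
        raise ValueError("fold must be >= 1 and efficiency in (0, 1]")
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    print(fold_amplification(20))
    print(cycles_needed(1_000_000))
```

Under perfect doubling, roughly 20 cycles suffice for a million-fold increase, consistent with the "several million-fold in a few hours" statement above.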

The DNA sequences of the subject invention can be used as primers for PCR
amplification. In performing PCR amplification, a certain degree of mismatch can be tolerated
between primer and template. Therefore, mutations, deletions, and insertions (especially
additions of nucleotides to the 5' end) of the exemplified primers fall within the scope of the
subject invention. Mutations, insertions and deletions can be produced in a given primer by
methods known to an ordinarily skilled artisan.

All of the U.S. patents cited herein are hereby incorporated by reference.

Following are examples which illustrate procedures for practicing the invention. These 
examples should not be construed as limiting. All percentages are by weight and all solvent 
mixture proportions are by volume unless otherwise noted. 

Example 1 - Culturing of Bacillus Isolates Useful According to the Invention

Growth of cells. The cellular host containing the Bacillus insecticidal gene may be
grown in any convenient nutrient medium. These cells may then be harvested in accordance
with conventional ways. Alternatively, the cells can be treated prior to harvesting.

The Bacillus cells of the invention can be cultured using standard art media and
fermentation techniques. During the fermentation cycle, the bacteria can be harvested by first
separating the Bacillus vegetative cells, spores, crystals, and lysed cellular debris from the
fermentation broth by means well known in the art. Any Bacillus spores or crystal δ-endotoxins
formed can be recovered employing well-known techniques and used as a conventional
δ-endotoxin B.t. preparation. The supernatant from the fermentation process contains the toxins
of the present invention. The toxins are isolated and purified employing well-known techniques.

A subculture of Bacillus isolates, or mutants thereof, can be used to inoculate the
following medium, known as TB broth:

Tryptone          12 g/l
Yeast Extract     24 g/l
Glycerol          4 g/l
KH2PO4            2.1 g/l
K2HPO4            14.7 g/l
pH 7.4

The potassium phosphate was added to the autoclaved broth after cooling. Flasks were
incubated at 30°C on a rotary shaker at 250 rpm for 24-36 hours.

The above procedure can be readily scaled up to large fermentors by procedures well
known in the art.

The Bacillus obtained in the above fermentation can be isolated by procedures well
known in the art. A frequently-used procedure is to subject the harvested fermentation broth to
separation techniques, e.g., centrifugation. In a specific embodiment, Bacillus proteins useful
according to the present invention can be obtained from the supernatant. The culture supernatant
containing the active protein(s) can be used in bioassays.

Alternatively, a subculture of Bacillus isolates, or mutants thereof, can be used to
inoculate the following peptone, glucose, salts medium:

Bacto Peptone     7.5 g/l
Glucose
KH2PO4            3.4 g/l
K2HPO4            4.35 g/l
Salt Solution     5.0 ml/l
CaCl2 Solution    5.0 ml/l
pH 7.2

Salts Solution (100 ml)
MgSO4·7H2O        2.46 g
MnSO4·H2O         0.04 g
ZnSO4·7H2O        0.28 g
FeSO4·7H2O        0.40 g

CaCl2 Solution (100 ml)
CaCl2·2H2O        3.66 g

The salts solution and CaCl2 solution are filter-sterilized and added to the autoclaved and
cooked broth at the time of inoculation. Flasks are incubated at 30°C on a rotary shaker at 200
rpm for 64 hr.

The above procedure can be readily scaled up to large fermentors by procedures well 
known in the art. 

The Bacillus spores and/or crystals, obtained in the above fermentation, can be isolated
by procedures well known in the art. A frequently-used procedure is to subject the harvested
fermentation broth to separation techniques, e.g., centrifugation.

Example 2 - Isolation and Preparation of Cellular DNA for PCR

DNA can be prepared from cells grown on Spizizen's agar, or other minimal or enriched
agar known to those skilled in the art, for approximately 16 hours. Spizizen's casamino acid agar
comprises 23.2 g/l Spizizen's minimal salts [(NH4)2SO4, 120 g; K2HPO4, 840 g; KH2PO4, 360 g;
sodium citrate, 60 g; MgSO4·7H2O, 12 g. Total: 1392 g]; 1.0 g/l vitamin-free casamino acids;
15.0 g/l Difco agar. In preparing the agar, the mixture was autoclaved for 30 minutes, then a
sterile, 50% glucose solution can be added to a final concentration of 0.5% (1/100 vol). Once
the cells are grown for about 16 hours, an approximately 1 cm2 patch of cells can be scraped
from the agar into 300 µl of 10 mM Tris-HCl (pH 8.0)-1 mM EDTA. Proteinase K was added
to 50 µg/ml and incubated at 55°C for 15 minutes. Other suitable proteases lacking nuclease
activity can be used. The samples were then placed in a boiling water bath for 15 minutes to
inactivate the proteinase and denature the DNA. This also precipitates unwanted components.
The samples are then centrifuged at 14,000 x g in an Eppendorf microfuge at room temperature
for 5 minutes to remove cellular debris. The supernatants containing crude DNA were
transferred to fresh tubes and frozen at -20°C until used in PCR reactions.

Alternatively, total cellular DNA may be prepared from plate-grown cells using the
QIAamp Tissue Kit from Qiagen (Santa Clarita, CA) following instructions from the
manufacturer.

Example 3 - Use of PCR Primers to Characterize and/or Identify Toxin Genes

Two primers useful in PCR procedures were designed to identify genes that encode
pesticidal toxins. Preferably, these toxins are active against lepidopteran insects. The DNA from
95 B.t. strains was subjected to PCR using these primers. Two clearly distinguishable molecular
weight bands were visible in "positive" strains, as outlined below. The frequency of strains
yielding a 339 bp fragment was 29/95 (31%). This fragment is referred to herein as the "339
bp fragment" even though some small deviation in the exact number of base pairs may be
observed.

GARCCRTGGA AAGCAAATAA TAARAATGC (SEQ ID NO. 1)
AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)

The strains which were positive for the 339 bp fragment (29 strains) were: PS11B,
PS31G1, PS36A, PS49C, PS81A2, PS81F, PS81GG, PS81I, PS85A1, PS86BB1, PS86V1,
PS86W1, PS89J3, PS91C2, PS94R1, PS101DD, PS158C2, PS185U2, PS192M4, PS202S,
PS213E5, PS218G2, PS244A2, HD29, HD110, HD129, HD525, HD573a, and Javelin 1990.

The 24 strains which gave a larger (approximately 1.2 kb) fragment were: PS24J, 
PS33F2, PS45B1, PS52A1, PS62B1, PS80PP3, PS86A1, PS86Q3, PS88F16, PS92B, PS101Z2, 
PS123D1, PS157C1, PS169E, PS177F1, PS177G, PS185L2, PS201L1, PS204C3, PS204G4, 
PS242H10, PS242K17, PS244A2, PS244D1. 

It was found that Bacillus strains producing lepidopteran-active proteins yielded only
the 339 bp fragment. Few, if any, of the strains amplifying the approximately 1.2 kb fragment
had known lepidopteran activity, but rather were coleopteran-, mite-, and/or nematode-active
B.t. crystal protein producing strains.

Example 4 - DNA Sequencing of Toxin Genes Producing the 339 Fragment

PCR-amplified segments of toxin genes present in Bacillus strains can be readily
sequenced. To accomplish this, amplified DNA fragments can be first cloned into the PCR
DNA TA-cloning plasmid vector, pCRII, as described by the supplier (Invitrogen, San Diego,
CA). Individual pCRII clones from the mixture of amplified DNA fragments from each Bacillus
strain are chosen for sequencing. Colonies are lysed by boiling to release crude plasmid DNA.
DNA templates for automated sequencing are amplified by PCR using vector-specific primers
flanking the plasmid multiple cloning sites. These DNA templates are sequenced using Applied
Biosystems (Foster City, CA) automated sequencing methodologies. The polypeptide sequences
can be deduced from these nucleotide sequences.

DNA from three of the 29 B.t. strains which amplified the 339 bp fragment was
sequenced. A DNA sequence encoding a toxin from strain PS36A is shown in SEQ ID NO. 3.
An amino acid sequence for the 36A toxin is shown in SEQ ID NO. 4. A DNA sequence
encoding a toxin from strain PS81F is shown in SEQ ID NO. 5. An amino acid sequence for the
81F toxin is shown in SEQ ID NO. 6. A DNA sequence encoding a toxin from strain Javelin
1990 is shown in SEQ ID NO. 7. An amino acid sequence for the Javelin 1990 toxin is shown
in SEQ ID NO. 8.

Example 5 - Determination of DNA Sequences from Additional Genes Encoding Toxins from
Strains PS158C2 and PS49C

Genes encoding novel toxins were identified from isolates PS158C2 and PS49C as
follows: Total cellular DNA was extracted from B.t. strains using Qiagen (Santa Clarita, CA)
Genomic-tip 500/G DNA extraction kits according to the supplier and was subjected to PCR
using the oligonucleotide primer pairs listed below. Amplified DNA fragments were purified
on Qiagen PCR purification columns and were used as templates for sequencing.

For PS158C2, the primers used were as follows.

158C2 PRIMER A:

GCTCTAGAAGGAGGTAACTTATGAACAAGAATAATACTAAATTAAGC
(SEQ ID NO. 9)

339 reverse:

AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)

The resulting PCR-amplified DNA fragment was approximately 2 kbp in size. This DNA was
partially sequenced by dideoxy chain termination using automated DNA sequencing technology
(Perkin Elmer/Applied Biosystems, Foster City, CA). A DNA sequence encoding a portion of
a soluble toxin from PS158C2 is shown in SEQ ID NO. 10.

For PS49C, two separate DNA fragments encoding parts of a novel toxin gene were
amplified and sequenced. The first fragment was amplified using the following primer pair:

49C PRIMER A:

CATCCTCCCTACACTTTCTAA (SEQ ID NO. 11)

339 reverse:

AAARTTATCT CCCCAWGCTT CATCTCCATT TTG (SEQ ID NO. 2)

The resulting approximately 1 kbp DNA fragment was used as a template for automated DNA
sequencing. A sequence of a portion of a toxin gene from strain PS49C is shown in SEQ ID NO.
12.

The second fragment was amplified using the following primer pair:

49C PRIMER B:

AAATTATGCGCTAAGTCTGC (SEQ ID NO. 13)

49C PRIMER C:

TTGATCCGGACATAATAAT (SEQ ID NO. 14)

The resulting approximately 0.57 kbp DNA fragment was used as a template for automated
DNA sequencing. An additional sequence of a portion of the toxin gene from PS49C is shown
in SEQ ID NO. 15.

Example 6 - Additional Primers for Characterizing and/or Identifying Toxin Genes

The following primer pair can be used to identify and/or characterize genes of the SUP-1
family:

SUP-1A:

GGATTCGTTATCAGAAA (SEQ ID NO. 53)

SUP-1B:

CTGTYGCTAACAATGTC (SEQ ID NO. 54)

These primers can be used in PCR procedures to amplify a fragment having a predicted size of
approximately 370 bp. A band of the predicted size was amplified from strains PS158C2 and
PS49C.



Example 7 - Additional Primers Useful for Characterizing and/or Identifying Toxin Genes

Another set of PCR primers can be used to identify and/or characterize additional genes
encoding pesticidal toxins. The sequences of these primers were as follows:

GGRTTAMTTGGRTAYTATTT (SEQ ID NO. 16)
ATATCKWAYATTKGCATTTA (SEQ ID NO. 17)

Redundant nucleotide codes used throughout the subject disclosure are in accordance
with the IUPAC convention and include:

R = A or G
M = A or C
Y = C or T
K = G or T
W = A or T
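
By way of illustration only, the redundant codes listed above can be expanded into the individual primer sequences they represent, as in the following Python sketch. The names `IUPAC`, `degeneracy`, and `expand` are hypothetical; the lookup table is limited to the five redundant codes listed above plus the four unambiguous bases, so a primer containing any other code raises an error.

```python
from itertools import product

# Unambiguous bases plus the redundant codes listed in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "M": "AC", "Y": "CT", "K": "GT", "W": "AT",
}


def _bases(primer: str):
    """Return the list of allowed bases at each position of the primer."""
    cleaned = primer.upper().replace(" ", "")
    try:
        return [IUPAC[code] for code in cleaned]
    except KeyError as err:
        raise ValueError(f"code not in the listed table: {err.args[0]}") from None


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for options in _bases(primer):
        count *= len(options)
    return count


def expand(primer: str) -> list:
    """List every non-degenerate sequence represented by the primer."""
    return ["".join(combo) for combo in product(*_bases(primer))]


if __name__ == "__main__":
    seq16 = "GGRTTAMTTGGRTAYTATTT"  # SEQ ID NO. 16
    print(degeneracy(seq16))
    print(expand(seq16)[:2])
```

A degenerate primer is thus a pool of related oligonucleotides; the degeneracy indicates how many distinct sequences are present in that pool.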



WO 98/18932 PCT/US97/19804 


Example 8 - Identification and Sequencing of Genes Encoding Novel Soluble Protein Toxins
from Bacillus Strains

PCR using primers SEQ ID NO. 16 and SEQ ID NO. 17 was performed on total cellular
genomic DNA isolated from a broad range of B.t. strains. Those samples yielding an
approximately 1 kb band were selected for characterization by DNA sequencing. Amplified
DNA fragments were first cloned into the PCR DNA TA-cloning plasmid vector, pCR2.1, as
described by the supplier (Invitrogen, San Diego, CA). Plasmids were isolated from
recombinant clones and tested for the presence of an approximately 1 kbp insert by PCR using
the plasmid vector primers, T3 and T7.

The following strains yielded the expected band of approximately 1000 bp, thus
indicating the presence of a MIS-type toxin gene: PS10E1, PS31J2, PS33D2, PS66D3, PS68F,
PS69AA2, PS168G1, PS177C8, PS177I8, PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2,
PS202E1, KB33, and KB38.

Plasmids were then isolated for use as sequencing templates using QIAGEN (Santa
Clarita, CA) miniprep kits as described by the supplier. Sequencing reactions were performed
using the Dye Terminator Cycle Sequencing Ready Reaction Kit from PE Applied Biosystems.
Sequencing reactions were run on an ABI PRISM 377 Automated Sequencer. Sequence data was
collected, edited, and assembled using the ABI PRISM 377 Collection, Factura, and
AutoAssembler software from PE ABI.

DNA sequences were determined for portions of novel toxin genes from the following
isolates: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS168G1, PS177C8, PS177I8,
PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, PS202E1, KB33, and KB38. Polypeptide
sequences were deduced for portions of the encoded, novel soluble toxins from the following
isolates: PS10E1, PS31J2, PS33D2, PS66D3, PS68F, PS69AA2, PS177C8, PS177I8,
PS185AA2, PS196F3, PS196J4, PS197T1, PS197U2, and PS202E1. These nucleotide sequences
and amino acid sequences are shown in SEQ ID NOS. 18 to 48.

Example 9 - Restriction Fragment Length Polymorphism (RFLP) of Toxins from Bacillus
thuringiensis Strains

Total cellular DNA was prepared from various Bacillus thuringiensis (B.t.) strains
grown to an optical density of 0.5-0.8 at 600 nm visible light. DNA was extracted using the
Qiagen Genomic-tip 500/G kit and Genomic DNA Buffer Set according to protocol for Gram
positive bacteria (Qiagen Inc.; Valencia, CA).

Standard Southern hybridizations using 32P-labeled probes were used to identify and
characterize novel toxin genes within the total genomic DNA preparations. Prepared total
genomic DNA was digested with various restriction enzymes, electrophoresed on a 1% agarose
gel, and immobilized on a supported nylon membrane using standard methods (Maniatis et al.).

PCR-amplified DNA fragments 1.0-1.1 kb in length were gel purified for use as probes.
Approximately 25 ng of each DNA fragment was used as a template for priming nascent DNA
synthesis using DNA polymerase I Klenow fragment (New England Biolabs), random
hexanucleotide primers (Boehringer Mannheim) and 32P-dCTP.

Each 32P-labeled fragment served as a specific probe to its corresponding genomic DNA
blot. Hybridizations of immobilized DNA with randomly labeled 32P probes were performed in
standard aqueous buffer consisting of 5X SSPE, 5X Denhardt's solution, 0.5% SDS, 0.1 mg/ml
at 65°C overnight. Blots were washed under moderate stringency in 0.2X SSC, 0.1% SDS at
65°C and exposed to film. RFLP data showing specific hybridization bands containing all or
part of the novel gene of interest was obtained for each strain.



(Strain) /      Probe Seq I.D.   RFLP Data (approximate band sizes)
Gene Name       Number

(PS)10E1*       18               EcoRI: 4 and 9 kbp, EcoRV: 4.5 and 6 kbp, KpnI: 12 and 24 kbp,
                                 SacI: 13 and 24 kbp, SalI: >23 kbp, XbaI: 5 and 15 kbp
(PS)31J2        20               ApaI: >23 kbp, BglII: 6.5 kbp, PstI: >23 kbp, SacI: >23 kbp,
                                 SalI: >23 kbp, XbaI: 5 kbp
(PS)33D2        22               EcoRI: 10 kbp, EcoRV: 15 kbp, HindIII: 18 kbp, KpnI: 9.5 kbp,
                                 PstI: 8 kbp
(PS)66D3        24               BamHI: 4.5 kbp, HindIII: >23 kbp, KpnI: 23 kbp, PstI: 15 kbp,
                                 XbaI: >23 kbp
(PS)68F*        26               EcoRI: 8.5 and 15 kbp, EcoRV: 7 and 18 kbp, HindIII: 2.1 and
                                 9.5 kbp, PstI: 3 and 18 kbp, XbaI: 10 and 15 kbp
(PS)69AA2       28               EcoRV: 9.5 kbp, HindIII: 18 kbp, KpnI: 23 kbp, NheI: >23 kbp,
                                 PstI: 10 kbp, SalI: >23 kbp
(PS)168G1       30               EcoRI: 10 kbp, EcoRV: 3.5 kbp, NheI: 20 kbp, PstI: 20 kbp,
                                 SalI: >23 kbp, XbaI: 15 kbp
(PS)177C8       31               HindIII: 2 kbp, XbaI: 1, 9 and 11 kbp
(PS)177I8       33               BamHI: >23 kbp, EcoRI: 10 kbp, HindIII: 2 kbp, SalI: >23 kbp,
                                 XbaI: 3.5 kbp
(PS)185AA2      35               EcoRI: 7 kbp, EcoRV: 10 kbp (& 3.5 kbp?), NheI: 4 kbp,
                                 PstI: 3 kbp, SalI: >23 kbp, XbaI: 4 kbp
(PS)196F3       37               EcoRI: 8 kbp, EcoRV: 9 kbp, NheI: 18 kbp, PstI: 18 kbp,
                                 SalI: 20 kbp, XbaI: 7 kbp
(PS)196J4*      39               BamHI: >23 kbp, EcoRI: 3.5 and 4.5 kbp, PstI: 9 and 24 kbp,
                                 SalI: >23 kbp, XbaI: 2.4 and 12 kbp
(PS)197T1       41               HindIII: 10 kbp, KpnI: 20 kbp, PstI: 20 kbp, SacI: 20 kbp,
                                 [illegible] kbp, XbaI: 5 kbp
(PS)197U2       43               EcoRI: 5 kbp, EcoRV: 1.9 kbp, NheI: 20 kbp, PstI: 23 kbp,
                                 SalI: >23 kbp, XbaI: 7 kbp
(PS)202E1       45               EcoRV: 7 kbp, KpnI: 12 kbp, NheI: 10 kbp, PstI: 15 kbp,
                                 SalI: 23 kbp, XbaI: 1.8 kbp
KB33            47               EcoRI: 9 kbp, EcoRV: 6 kbp, HindIII: 8 kbp, KpnI: >23 kbp,
                                 NheI: 22 kbp, SalI: >23 kbp
KB38            48               BamHI: 5.5 kbp, EcoRV: 22 kbp, HindIII: 2.2 kbp, NheI: 20 kbp,
                                 PstI: >23 kbp

*Enzymes used in genomic DNA digests were chosen on the basis of lacking recognition sites
within the sequence of the PCR fragments used as probes for each sample (except 177C8, for
which the entire operon, containing >1 XbaI site within the sequence, was used). Strains indicated
by an asterisk contain more than one gene with high homology to the probe used, as indicated by
the presence of multiple hybridizing bands.
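
By way of illustration only, the enzyme-selection criterion described in the note above (choosing enzymes that lack recognition sites within the probe) can be sketched as follows. The recognition sequences are the standard ones for the enzymes named in the table; the function name and the example probe are hypothetical and are not taken from the subject sequences.

```python
# Standard recognition sequences for the enzymes named in the RFLP table.
# All of these sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "ApaI": "GGGCCC", "BamHI": "GGATCC", "BglII": "AGATCT", "EcoRI": "GAATTC",
    "EcoRV": "GATATC", "HindIII": "AAGCTT", "KpnI": "GGTACC", "NheI": "GCTAGC",
    "PstI": "CTGCAG", "SacI": "GAGCTC", "SalI": "GTCGAC", "XbaI": "TCTAGA",
}


def non_cutting_enzymes(probe_seq, sites=RECOGNITION_SITES):
    """Return the enzymes whose recognition site is absent from the probe."""
    # Sequence listings are often printed in blocks of ten; drop the spaces.
    probe = "".join(probe_seq.split()).upper()
    return sorted(name for name, site in sites.items() if site not in probe)


# Hypothetical probe fragment containing one EcoRI site and one XbaI site.
example_probe = "ATGGAATTCC GTCTAGAAAT"
usable = non_cutting_enzymes(example_probe)
print(usable)
```

An enzyme returned by such a check would be expected to leave the probe-homologous region intact, so that each hybridizing band corresponds to a single gene copy.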

Example 10 - Additional PCR Primers for Characterizing and/or Identifying Novel Genes
Another set of PCR primers can be used to identify additional novel genes encoding
pesticidal toxins. The sequences of these primers were as follows:



ICON-forward: CTTGAYTTTAAARATGATRTA (SEQ ID NO. 49)

ICON-reverse: AATRGCSWATAAATAMGCACC (SEQ ID NO. 50)

These primers can be used in PCR procedures to amplify a fragment having a predicted size of
about 450 bp.

Strains PS177C8, PS177I8, and PS66D3 were screened and were found to have genes
amplifiable with these ICON primers. A sequence of a toxin gene from PS177C8 is shown in
SEQ ID NO. 51. An amino acid sequence of the 177C8-ICON toxin is shown in SEQ ID NO. 52.
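
The ICON primers contain degenerate positions written in the standard IUPAC nucleotide code (R, Y, S, W, M). As a minimal sketch, assuming only that standard code, the following shows how many distinct oligonucleotides each degenerate primer represents and how a primer can be turned into a pattern for searching a sequence. The function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def primer_regex(primer):
    """Compile a regular expression matching every variant of the primer."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


ICON_FORWARD = "CTTGAYTTTAAARATGATRTA"  # SEQ ID NO. 49
ICON_REVERSE = "AATRGCSWATAAATAMGCACC"  # SEQ ID NO. 50

print(degeneracy(ICON_FORWARD), degeneracy(ICON_REVERSE))
```

The reverse primer anneals to the opposite strand, so a search of a coding-strand sequence for its binding site would use the reverse complement of the pattern; that step is omitted here for brevity.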
Example 11 - Use of Mixed Primer Pairs to Characterize and/or Identify Toxin Genes

Various combinations of the primers described herein can be used to identify and/or 
characterize toxin genes. PCR conditions can be used as indicated below: 



                 SEQ ID NO. 16/17          SEQ ID NO. 49/50          SEQ ID NO. 49/17
Pre-denature     94°C 1 min                94°C 1 min                94°C 1 min
Program cycle    94°C 1 min                94°C 1 min                94°C 1 min
                 42°C 2 min                42°C 2 min                42°C 2 min
                 72°C 3 min + 5 sec/cycle  72°C 3 min + 5 sec/cycle  72°C 3 min + 5 sec/cycle
                 Repeat cycle 29 times     Repeat cycle 29 times     Repeat cycle 29 times
                 Hold 4°C                  Hold 4°C                  Hold 4°C
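
As a rough worked example, the programmed hold time of the cycling protocol above can be tallied as follows. This sketch assumes that the program runs 30 cycles in total (one cycle plus 29 repeats), that the 5-second extension increment first applies in the second cycle, and that ramp times and the final 4°C hold are ignored; none of these details is stated in the protocol itself, and the function names are hypothetical.

```python
def extension_seconds(cycle_index, base_s=180, increment_s=5):
    """Extension time for a given cycle (cycle_index 0 is the first cycle)."""
    return base_s + increment_s * cycle_index


def total_program_seconds(cycles=30, predenature_s=60, denature_s=60,
                          anneal_s=120, base_ext_s=180, increment_s=5):
    """Sum of programmed hold times, excluding ramping and the 4°C hold."""
    total = predenature_s
    for i in range(cycles):
        total += denature_s + anneal_s
        total += extension_seconds(i, base_ext_s, increment_s)
    return total


seconds = total_program_seconds()
print(seconds, "s, or about", round(seconds / 3600, 2), "h")
```

Under these assumptions the extension step grows from 3 minutes in the first cycle to 5 minutes 25 seconds in the last.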



Using the above protocol, a strain harboring a MIS-type of toxin would be expected to
yield a 1000 bp fragment with the SEQ ID NO. 16/17 primer pair. A strain harboring a WAR-type
of toxin would be expected to amplify a fragment of about 475 bp with the SEQ ID NO.
49/50 primer pair, or a fragment of about 1800 bp with the SEQ ID NO. 49/17 primer pair. The
amplified fragments of the expected size were found in four strains. The results are reported in
Table 3.



Table 3. Approximate Amplified Fragment Sizes (bp)

Strain      SEQ ID NO. 16/17   SEQ ID NO. 49/50        SEQ ID NO. 49/17
PS66D3      1000               900, 475                1800
PS177C8     1000               475                     1800
PS177I8     1000               900, 550, 475           1800
PS217U2     1000               2500, 1500, 900, 475    no band detected
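
The interpretation of Table 3 can be sketched in code as follows. The expected fragment sizes are those stated in the text; the 10% size tolerance, the data layout, and the function names are assumptions made only for illustration, since gel-estimated band sizes are approximate.

```python
# Expected fragment sizes (bp) per toxin type and primer pair, from the text.
EXPECTED_FRAGMENTS = {
    "MIS": {"16/17": 1000},
    "WAR": {"49/50": 475, "49/17": 1800},
}

# Observed band sizes (bp) from Table 3, keyed by strain and primer pair.
TABLE_3 = {
    "PS66D3": {"16/17": [1000], "49/50": [900, 475], "49/17": [1800]},
    "PS177C8": {"16/17": [1000], "49/50": [475], "49/17": [1800]},
    "PS177I8": {"16/17": [1000], "49/50": [900, 550, 475], "49/17": [1800]},
    "PS217U2": {"16/17": [1000], "49/50": [2500, 1500, 900, 475], "49/17": []},
}


def size_matches(observed_bp, expected_bp, tolerance=0.10):
    """True if an observed band lies within the tolerance of the expected size."""
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp


def toxin_types(bands, tolerance=0.10):
    """Toxin types suggested by at least one band of the expected size."""
    found = set()
    for toxin, expected in EXPECTED_FRAGMENTS.items():
        for pair, size in expected.items():
            observed = bands.get(pair, [])
            if any(size_matches(b, size, tolerance) for b in observed):
                found.add(toxin)
    return found


for strain, bands in TABLE_3.items():
    print(strain, sorted(toxin_types(bands)))
```

A band of the expected size is only suggestive of a gene type; confirmation would rely on sequencing or on the hybridization and antibody methods described elsewhere herein.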



Example 12 - Characterization and/or Identification of WAR Toxins

In a further embodiment of the subject invention, pesticidal toxins can be characterized 
and/or identified by their level of reactivity with antibodies to pesticidal toxins exemplified 
herein. In a specific embodiment, antibodies can be raised to WAR toxins such as the toxin 
obtainable from PS177C8a. Other WAR toxins can then be identified and/or characterized by 
their reactivity with the antibodies. In a preferred embodiment, the antibodies are polyclonal 
have the greatest reactivity with the polyclonal antibodies. WAR toxins with greater diversity
react with the 177C8a polyclonal antibodies, but to a lesser extent. Toxins which immunoreact
with polyclonal antibodies raised to the 177C8a WAR toxin can be obtained from, for example,
the isolates designated PS177C8a, PS177I8, PS66D3, KB68B55-2, PS185Y2, PS146F,
KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2, KB58B46-2, and PS146D. Such diverse
WAR toxins can be further characterized by, for example, whether or not their genes can be
amplified with ICON primers. For example, the following isolates do not have polynucleotide
sequences which are amplified by ICON primers: PS177C8a, PS177I8, PS66D3, KB68B55-2,
PS185Y2, PS146F, KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2, KB58B46-2, and
PS146D. Of these, isolates PS28K1, PS31F2, KB68B46-2, and PS146D show the weakest
antibody reactivity, suggesting advantageous diversity.

Example 13 - Bioassays for Activity Against Lepidopterans and Coleopterans

Biological activity of the toxins and isolates of the subject invention can be confirmed
using standard bioassay procedures. One such assay is the budworm-bollworm (Heliothis
virescens [Fabricius] and Helicoverpa zea [Boddie]) assay. Lepidoptera bioassays were
conducted with either surface application to artificial insect diet or diet incorporation of samples.
All Lepidopteran insects were tested from the neonate stage to the second instar. All assays
were conducted with either toasted soy flour artificial diet or black cutworm artificial diet
(BioServ, Frenchtown, NJ).

Diet incorporation can be conducted by mixing the samples with artificial diet at a rate
of 6 mL suspension plus 54 mL diet. After vortexing, this mixture is poured into plastic trays
with compartmentalized 3-ml wells (Nutrend Container Corporation, Jacksonville, FL). A water
blank containing no B.t. serves as the control. First instar larvae (USDA-ARS, Stoneville, MS)
are placed onto the diet mixture. Wells are then sealed with Mylar sheeting (ClearLam
Packaging, IL) using a tacking iron, and several pinholes are made in each well to provide gas
exchange. Larvae were held at 25°C for 6 days in a 14:10 (light:dark) holding room. Mortality
and stunting are recorded after six days.
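
For illustration, the dilution implied by the diet incorporation ratio above (6 mL suspension plus 54 mL diet) can be computed as follows. The function names and the example stock concentration are hypothetical, and the concentration units are whatever units the stock suspension is expressed in.

```python
def dilution_factor(sample_ml=6.0, diet_ml=54.0):
    """Fold dilution of the sample once it is mixed into the diet."""
    return (sample_ml + diet_ml) / sample_ml


def final_concentration(stock_conc, sample_ml=6.0, diet_ml=54.0):
    """Concentration of the sample in the final diet mixture."""
    return stock_conc / dilution_factor(sample_ml, diet_ml)


# Hypothetical stock at 100 units per mL, mixed at the stated 6 mL + 54 mL rate.
print(dilution_factor(), final_concentration(100.0))
```

At the stated rate the sample is diluted ten-fold, so a dose series in the diet can be planned by adjusting the stock suspension before mixing.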

Bioassay by the top load method utilizes the same sample and diet preparations as listed
above. The samples are applied to the surface of the insect diet. In a specific embodiment,
surface area ranged from 0.3 to approximately 0.8 cm² depending on the tray size; 96-well tissue
culture plates were used in addition to the format listed above. Following application, samples
are allowed to air dry before insect infestation. A water blank containing no B.t. can serve as the
control. Eggs are applied to each treated well, and the wells are then sealed with Mylar sheeting
(ClearLam Packaging, IL) using a tacking iron, and pinholes are made in each well to provide
gas exchange. Bioassays are held at 25°C for 7 days in a 14:10 (light:dark) or 28°C for 4 days
in a 14:10 (light:dark) holding room. Mortality and insect stunting are recorded at the end of
each bioassay.

Another assay useful according to the subject invention is the Western corn rootworm
assay. Samples can be bioassayed against neonate western corn rootworm larvae (Diabrotica
virgifera virgifera) via top-loading of sample onto an agar-based artificial diet at a rate of .60
ml/cm². Artificial diet can be dispensed into 0.78 cm² wells in 48-well tissue culture or similar
plates and allowed to harden. After the diet solidifies, samples are dispensed by pipette onto the
diet surface. Excess liquid is then evaporated from the surface prior to transferring
approximately three neonate larvae per well onto the diet surface by camel's hair brush. To
prevent insect escape while allowing gas exchange, wells are heat-sealed with 2-mil punched
polyester film with 27HT adhesive (Oliver Products Company, Grand Rapids, Michigan).
Bioassays are held in darkness at 25°C, and mortality scored after four days.

Analogous bioassays can be performed by those skilled in the art to assess activity
against other pests, such as the black cutworm (Agrotis ipsilon).

Results are shown in Table 4. 
Example 14 - Results of Western Corn Rootworm Bioassays

Concentrated liquid supernatant solutions, obtained according to the subject invention,
were tested for activity against Western corn rootworm (WCRW). Supernatants from the
following isolates were found to cause mortality against WCRW: PS10E1, PS31F2, PS3U2,
PS33D2, PS66D3, PS68F, PS80JJ1, PS146D, PS175I4, PS177I8, PS196J4, PS197T1, PS197U2,
KB33, KB53A49-4, KB68B46-2, KB68B51-2, KB68B55-2, PS177C8, PS69AA2, KB38,
PS196F3, PS168G1, PS202E1, PS217U2 and PS185AA2.

Example 15 - Results of Budworm/Bollworm Bioassays

Concentrated liquid supernatant solutions, obtained according to the subject invention,
were tested for activity against Heliothis virescens (H.v.) and Helicoverpa zea (H.z.).
Supernatants from the following isolates were tested and were found to cause mortality against
H.v.: PS157C1, PS31G1, PS49C, PS81F, PS81I, Javelin 1990, PS158C2, PS202S, PS36A,
HD110, and HD29. Supernatants from the following isolates were tested and were found to
cause significant mortality against H.z.: PS31G1, PS49C, PS81F, PS81I, PS157C1, PS158C2,
PS36A, HD110, and Javelin 1990.

Example 16 - Target Pests

Toxins of the subject invention can be used, alone or in combination with other toxins,
to control one or more non-mammalian pests. These pests may be, for example, those listed in
Table 5. Activity can readily be confirmed using the bioassays provided herein, adaptations of
these bioassays, and/or other bioassays well known to those skilled in the art.



Table 5. Target pest species

ORDER/Common Name                          Latin Name

LEPIDOPTERA
European Corn Borer                        Ostrinia nubilalis
European Corn Borer resistant to Cry1Ab    Ostrinia nubilalis
Black Cutworm                              Agrotis ipsilon
Fall Armyworm                              Spodoptera frugiperda
Southwestern Corn Borer                    Diatraea grandiosella
Corn Earworm/Bollworm                      Helicoverpa zea
Tobacco Budworm                            Heliothis virescens
Tobacco Budworm Rs                         Heliothis virescens
Sunflower Head Moth                        Homeosoma ellectellum
Banded Sunflower Moth                      Cochylis hospes
Argentine Looper                           Rachiplusia nu
Spilosoma                                  Spilosoma virginica
Bertha Armyworm                            Mamestra configurata
Diamondback Moth                           Plutella xylostella
COLEOPTERA
Red Sunflower Seed Weevil                  Smicronyx fulvus
Sunflower Stem Weevil                      Cylindrocopturus adspersus
Sunflower Beetle                           Zygoramma exclamationis
Canola Flea Beetle                         Phyllotreta cruciferae
Western Corn Rootworm                      Diabrotica virgifera virgifera
DIPTERA
Hessian Fly                                Mayetiola destructor
HOMOPTERA
Greenbug                                   Schizaphis graminum
HEMIPTERA
Lygus Bug                                  Lygus lineolaris
NEMATODA
                                           Heterodera glycines



Example 17 - Insertion of Toxin Genes Into Plants

One aspect of the subject invention is the transformation of plants with genes encoding
the pesticidal toxin. The transformed plants are resistant to attack by the target pest.

Genes encoding pesticidal toxins, as disclosed herein, can be inserted into plant cells
using a variety of techniques which are well known in the art. For example, a large number of
cloning vectors comprising a replication system in E. coli and a marker that permits selection
of the transformed cells are available for preparation for the insertion of foreign genes into
higher plants. The vectors comprise, for example, pBR322, pUC series, M13mp series,
pACYC184, etc. Accordingly, the sequence encoding the Bacillus toxin can be inserted into the
vector at a suitable restriction site. The resulting plasmid is used for transformation into E. coli.
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The E. coli cells are cultivated in a suitable nutrient medium, then harvested and lysed. The 
plasmid is recovered. Sequence analysis, restriction analysis, electrophoresis, and other 
biochemical-molecular biological methods are generally carried out as methods of analysis. 
After each manipulation, the DNA sequence used can be cleaved and joined to the next DNA 
sequence. Each plasmid sequence can be cloned in the same or other plasmids. Depending on 
the method of inserting desired genes into the plant, other DNA sequences may be necessary. 
If, for example, the Ti or Ri plasmid is used for the transformation of the plant cell, then at least 
the right border, but often the right and the left border of the Ti or Ri plasmid T-DNA, has to be
joined as the flanking region of the genes to be inserted. 

The use of T-DNA for the transformation of plant cells has been intensively researched 
and sufficiently described in EP 120 516; Hoekema (1985) In: The Binary Plant Vector System,
Offset-durkkerij Kanters B.V., Alblasserdam, Chapter 5; Fraley et al., Crit. Rev. Plant Sci. 4:1-
46; and An et al. (1985) EMBO J. 4:277-287. 

Once the inserted DNA has been integrated in the genome, it is relatively stable there 
and, as a rule, does not come out again. It normally contains a selection marker that confers on 
the transformed plant cells resistance to a biocide or an antibiotic, such as kanamycin, G 418,
bleomycin, hygromycin, or chloramphenicol, inter alia. The individually employed marker 
should accordingly permit the selection of transformed cells rather than cells that do not contain 
the inserted DNA. 

A large number of techniques are available for inserting DNA into a plant host cell. 
Those techniques include transformation with T-DNA using Agrobacterium tumefaciens or 
Agrobacterium rhizogenes as transformation agent, fusion, injection, biolistics (microparticle 
bombardment), or electroporation as well as other possible methods. If Agrobacteria are used
for the transformation, the DNA to be inserted has to be cloned into special plasmids, namely
either into an intermediate vector or into a binary vector. The intermediate vectors can be
integrated into the Ti or Ri plasmid by homologous recombination owing to sequences that are 
homologous to sequences in the T-DNA. The Ti or Ri plasmid also comprises the vir region 
necessary for the transfer of the T-DNA. Intermediate vectors cannot replicate themselves in 
Agrobacteria. The intermediate vector can be transferred into Agrobacterium tumefaciens by 
means of a helper plasmid (conjugation). Binary vectors can replicate themselves both in E. coli 
and in Agrobacteria. They comprise a selection marker gene and a linker or polylinker which 
are framed by the right and left T-DNA border regions. They can be transformed directly into 
Agrobacteria (Holsters et al. [1978] Mol. Gen. Genet. 163:181-187). The Agrobacterium used
as host cell is to comprise a plasmid carrying a vir region. The vir region is necessary for the 
transfer of the T-DNA into the plant cell. Additional T-DNA may be contained. The bacterium
so transformed is used for the transformation of plant cells. Plant explants can advantageously
be cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes for the transfer of
the DNA into the plant cell. Whole plants can then be regenerated from the infected plant
material (for example, pieces of leaf, segments of stalk, roots, but also protoplasts or
suspension-cultivated cells) in a suitable medium, which may contain antibiotics or biocides for
selection. The plants so obtained can then be tested for the presence of the inserted DNA. No special
demands are made of the plasmids in the case of injection and electroporation. It is possible to
use ordinary plasmids, such as, for example, pUC derivatives. In biolistic transformation,
plasmid DNA or linear DNA can be employed.

The transformed cells are regenerated into morphologically normal plants in the usual
manner. If a transformation event involves a germ line cell, then the inserted DNA and
corresponding phenotypic trait(s) will be transmitted to progeny plants. Such plants can be
grown in the normal manner and crossed with plants that have the same transformed hereditary
factors or other hereditary factors. The resulting hybrid individuals have the corresponding
phenotypic properties.

In a preferred embodiment of the subject invention, plants will be transformed with
genes wherein the codon usage has been optimized for plants. See, for example, U.S. Patent No.
5,380,831. Also, advantageously, plants encoding a truncated toxin will be used. The truncated
toxin typically will encode about 55% to about 80% of the full length toxin. Methods for
creating synthetic Bacillus genes for use in plants are known in the art.

It should be understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be
suggested to persons skilled in the art and are to be included within the spirit and purview of this
application.
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT:

Applicant Name(s): MYCOGEN CORPORATION
Street address: 5501 Oberlin Drive
City: San Diego
State/Province: California
Country: US
Postal code/Zip: 92121
Phone number: (619) 453-8030
Fax number: (619) 453-6991



(ii) TITLE OF INVENTION: Novel Pesticidal Toxins and Nucleotide 
Sequences Which Encode These Toxins 



(iii) NUMBER OF SEQUENCES: 134 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE : Saliwanchik, Lloyd & Saliwanchik 

(B) STREET: 2421 N.W. 41st Street, Suite A-l 

(C) CITY: Gainesville 

(D) STATE: FL 

(E) COUNTRY: US 

(F) ZIP: 32606-6669 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/029,848 

(B) FILING DATE: 30-OCT-1996 

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: Saliwanchik, David R. 

(B) REGISTRATION NUMBER: 39,355 

(C) REFERENCE/DOCKET NUMBER: MA-708

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 352-375-8100 

(B) TELEFAX: 352-372-5800 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GARCCRTGGA AAGCAAATAA TAARAATGC



(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
AAARTTATCT CCCCAWGCTT CATCTCCATT TTG 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 36a

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600

TCTCCTGCAA ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 1080

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140

TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260

GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380

GAAGCGGAGT ATAAAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC 1440

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560

AGCAATAAAG AAACTAAATT GATCGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 1620

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 1680

GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740

ATTTCACAAT TTATTGGAGA TAATTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800

GTTAAAGGAA AACCTTCTAT TCATTTAATA GATGAAAATA CTGGATATAT TCATTATGAA 1860

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 1920

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100

CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT 2280

GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT 2340

GTACATTTTT ACGATGTCTC TATTAAGTAA CCCAA 2375

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 790 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 36a 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asn Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Lys Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Asn Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Ile Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775 780

Asp Val Ser Ile Lys Pro
785 790

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2370 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 81Fd 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 60 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 120 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 180 

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 240

TTAAATACAG AATTATCTAA AGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 300

AATGATGTTG ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 360 

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 420 

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 480 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 540

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 600 

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 660 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 720 
AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 780

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 840

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 900 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 960 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 1020 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GGTTGGGTTT 1080

GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT 1140

TATCAAGTTG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 1200 

TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 1260 

GTAATTACTA AAATTGATTT TACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 1320

AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 1380 

GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGAG TGTATATGCC GTTAGGTGTC 1440

ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 1500 

AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 1560 

AGCAATAAAG AAACTAAATT GATCGTCCCG CCCAGTGGTT TTATTAAAAA TATTGTAGAG 1620 

AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGAGTAT 1680 

GTAGATCATA CAGGCGGAGT GAATGGRACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 1740

ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 1800 

GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 1860 

GATACAAATA ATAATTTAGA AGATTATCAA ACTATTACTA AACGTTTTAC TACAGGAACT 1920 

GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 1980 

AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 2040 

ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 2100 

CAGGGAGGAC GAGGAATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 2160 

GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 2220 

TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATTTTCAC TACAAAATTT 2280 

GGGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTAAATGG TGGCCCTATT 2340

GTACAGTTTC CCGATGTCTC TATTAAGTAA 2370 
(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 789 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 81Fd 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5 10 15

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25 30

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40 45

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55 60

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70 75 80

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90 95

Asn Gln Val Leu Asn Asp Val Asp Asn Lys Leu Asp Ala Ile Asn Thr
100 105 110

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120 125

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135 140

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150 155 160

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170 175

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185 190

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200 205

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215 220

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230 235 240

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250 255

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265 270

Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280 285

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295 300

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310 315 320

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330 335

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345 350

Pro Gly His Ala Leu Val Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360 365

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375 380

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390 395 400

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405 410 415

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425 430

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440 445

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455 460

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470 475 480

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485 490 495

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505 510

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520 525

Val Pro Pro Ser Gly Phe Ile Lys Asn Ile Val Glu Asn Gly Ser Ile
530 535 540

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Glu Tyr
545 550 555 560

Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570 575

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585 590

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600 605

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615 620

Asn Leu Glu Asp Tyr Gln Thr Ile Thr Lys Arg Phe Thr Thr Gly Thr
625 630 635 640

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650 655

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665 670

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680 685

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690 695 700

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710 715 720

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730 735

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745 750

Ser Glu Ile Phe Thr Thr Lys Phe Gly Lys Asp Asn Phe Tyr Ile Glu
755 760 765

Leu Ser Gln Gly Asn Asn Leu Asn Gly Gly Pro Ile Val Gln Phe Pro
770 775 780

Asp Val Ser Ile Lys
785

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Jav90 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGAACAAGA ATAATACTAA ATTAAGCACA AGAGCCTTAC CAAGTTTTAT TGATTATTTT 

AATGGCATTT ATGGATTTGC CACTGGTATC AAAGACATTA TGAACATGAT TTTTAAAACG 

GATACAGGTG GTGATCTAAC CCTAGACGAA ATTTTAAAGA ATCAGCAGTT ACTAAATGAT 

ATTTCTGGTA AATTGGATGG GGTGAATGGA AGCTTAAATG ATCTTATCGC ACAGGGAAAC 

TTAAATACAG AATTATCTAA GGAAATATTA AAAATTGCAA ATGAACAAAA TCAAGTTTTA 

AATGATGTTA ATAACAAACT CGATGCGATA AATACGATGC TTCGGGTATA TCTACCTAAA 

ATTACCTCTA TGTTGAGTGA TGTAATGAAA CAAAATTATG CGCTAAGTCT GCAAATAGAA 

TACTTAAGTA AACAATTGCA AGAGATTTCT GATAAGTTGG ATATTATTAA TGTAAATGTA 

CTTATTAACT CTACACTTAC TGAAATTACA CCTGCGTATC AAAGGATTAA ATATGTGAAC 

GAAAAATTTG AGGAATTAAC TTTTGCTACA GAAACTAGTT CAAAAGTAAA AAAGGATGGC 

TCTCCTGCAG ATATTCTTGA TGAGTTAACT GAGTTAACTG AACTAGCGAA AAGTGTAACA 

AAAAATGATG TGGATGGTTT TGAATTTTAC CTTAATACAT TCCACGATGT AATGGTAGGA 

AATAATTTAT TCGGGCGTTC AGCTTTAAAA ACTGCATCGG AATTAATTAC TAAAGAAAAT 

GTGAAAACAA GTGGCAGTGA GGTCGGAAAT GTTTATAACT TCTTAATTGT ATTAACAGCT 

CTGCAAGCAA AAGCTTTTCT TACTTTAACA ACATGCCGAA AATTATTAGG CTTAGCAGAT 

ATTGATTATA CTTCTATTAT GAATGAACAT TTAAATAAGG AAAAAGAGGA ATTTAGAGTA 

AACATCCTCC CTACACTTTC TAATACTTTT TCTAATCCTA ATTATGCAAA AGTTAAAGGA 

AGTGATGAAG ATGCAAAGAT GATTGTGGAA GCTAAACCAG GACATGCATT GATTGGGTTT 



GAAATTAGTA ATGATTCAAT TACAGTATTA AAAGTATATG AGGCTAAGCT AAAACAAAAT
TATCAAGTCG ATAAGGATTC CTTATCGGAA GTTATTTATG GTGATATGGA TAAATTATTG 
TGCCCAGATC AATCTGAACA AATCTATTAT ACAAATAACA TAGTATTTCC AAATGAATAT 
GTAATTACTA AAATTGATTT CACTAAAAAA ATGAAAACTT TAAGATATGA GGTAACAGCG 
AATTTTTATG ATTCTTCTAC AGGAGAAATT GACTTAAATA AGAAAAAAGT AGAATCAAGT 
GAAGCGGAGT ATAGAACGTT AAGTGCTAAT GATGATGGGG TGTATATGCC GTTAGGTGTC
ATCAGTGAAA CATTTTTGAC TCCGATTAAT GGGTTTGGCC TCCAAGCTGA TGAAAATTCA 
AGATTAATTA CTTTAACATG TAAATCATAT TTAAGAGAAC TACTGCTAGC AACAGACTTA 
AGCAATAAAG AAACTAAATT GATYGTCCCG CCAAGTGGTT TTATTAGCAA TATTGTAGAG 
AACGGGTCCA TAGAAGAGGA CAATTTAGAG CCGTGGAAAG CAAATAATAA GAATGCGTAT 
GTAGATCATA CAGGCGGAGT GAATGGAACT AAAGCTTTAT ATGTTCATAA GGACGGAGGA 
ATTTCACAAT TTATTGGAGA TAAGTTAAAA CCGAAAACTG AGTATGTAAT CCAATATACT 
GTTAAAGGAA AACCTTCTAT TCATTTAAAA GATGAAAATA CTGGATATAT TCATTATGAA 
GATACAAATA ATAATTTAGA AGATTATCAA ACTATTAATA AACGTTTTAC TACAGGAACT 
GATTTAAAGG GAGTGTATTT AATTTTAAAA AGTCAAAATG GAGATGAAGC TTGGGGAGAT 
AACTTTATTA TTTTGGAAAT TAGTCCTTCT GAAAAGTTAT TAAGTCCAGA ATTAATTAAT 
ACAAATAATT GGACGAGTAC GGGATCAACT AATATTAGCG GTAATACACT CACTCTTTAT 
CAGGGAGGAC GAGGGATTCT AAAACAAAAC CTTCAATTAG ATAGTTTTTC AACTTATAGA 
GTGTATTTTT CTGTGTCCGG AGATGCTAAT GTAAGGATTA GAAATTCTAG GGAAGTGTTA 
TTTGAAAAAA GATATATGAG CGGTGCTAAA GATGTTTCTG AAATGTTCAC TACAAAATTT
GAGAAAGATA ACTTTTATAT AGAGCTTTCT CAAGGGAATA ATTTATATGG TGGTCCTATT
GTACATTTTT ACGATGTCTC TATTAAGTAA CCCAA

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 790 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: Jav90

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Asn Lys Asn Asn Thr Lys Leu Ser Thr Arg Ala Leu Pro Ser Phe
1 5

Ile Asp Tyr Phe Asn Gly Ile Tyr Gly Phe Ala Thr Gly Ile Lys Asp
20 25

Ile Met Asn Met Ile Phe Lys Thr Asp Thr Gly Gly Asp Leu Thr Leu
35 40

Asp Glu Ile Leu Lys Asn Gln Gln Leu Leu Asn Asp Ile Ser Gly Lys
50 55

Leu Asp Gly Val Asn Gly Ser Leu Asn Asp Leu Ile Ala Gln Gly Asn
65 70

Leu Asn Thr Glu Leu Ser Lys Glu Ile Leu Lys Ile Ala Asn Glu Gln
85 90

Asn Gln Val Leu Asn Asp Val Asn Asn Lys Leu Asp Ala Ile Asn Thr
100 105

Met Leu Arg Val Tyr Leu Pro Lys Ile Thr Ser Met Leu Ser Asp Val
115 120

Met Lys Gln Asn Tyr Ala Leu Ser Leu Gln Ile Glu Tyr Leu Ser Lys
130 135

Gln Leu Gln Glu Ile Ser Asp Lys Leu Asp Ile Ile Asn Val Asn Val
145 150

Leu Ile Asn Ser Thr Leu Thr Glu Ile Thr Pro Ala Tyr Gln Arg Ile
165 170

Lys Tyr Val Asn Glu Lys Phe Glu Glu Leu Thr Phe Ala Thr Glu Thr
180 185

Ser Ser Lys Val Lys Lys Asp Gly Ser Pro Ala Asp Ile Leu Asp Glu
195 200

Leu Thr Glu Leu Thr Glu Leu Ala Lys Ser Val Thr Lys Asn Asp Val
210 215

Asp Gly Phe Glu Phe Tyr Leu Asn Thr Phe His Asp Val Met Val Gly
225 230

Asn Asn Leu Phe Gly Arg Ser Ala Leu Lys Thr Ala Ser Glu Leu Ile
245 250

Thr Lys Glu Asn Val Lys Thr Ser Gly Ser Glu Val Gly Asn Val Tyr
260 265
Asn Phe Leu Ile Val Leu Thr Ala Leu Gln Ala Lys Ala Phe Leu Thr
275 280

Leu Thr Thr Cys Arg Lys Leu Leu Gly Leu Ala Asp Ile Asp Tyr Thr
290 295

Ser Ile Met Asn Glu His Leu Asn Lys Glu Lys Glu Glu Phe Arg Val
305 310

Asn Ile Leu Pro Thr Leu Ser Asn Thr Phe Ser Asn Pro Asn Tyr Ala
325 330

Lys Val Lys Gly Ser Asp Glu Asp Ala Lys Met Ile Val Glu Ala Lys
340 345

Pro Gly His Ala Leu Ile Gly Phe Glu Ile Ser Asn Asp Ser Ile Thr
355 360

Val Leu Lys Val Tyr Glu Ala Lys Leu Lys Gln Asn Tyr Gln Val Asp
370 375

Lys Asp Ser Leu Ser Glu Val Ile Tyr Gly Asp Met Asp Lys Leu Leu
385 390

Cys Pro Asp Gln Ser Glu Gln Ile Tyr Tyr Thr Asn Asn Ile Val Phe
405

Pro Asn Glu Tyr Val Ile Thr Lys Ile Asp Phe Thr Lys Lys Met Lys
420 425

Thr Leu Arg Tyr Glu Val Thr Ala Asn Phe Tyr Asp Ser Ser Thr Gly
435 440

Glu Ile Asp Leu Asn Lys Lys Lys Val Glu Ser Ser Glu Ala Glu Tyr
450 455

Arg Thr Leu Ser Ala Asn Asp Asp Gly Val Tyr Met Pro Leu Gly Val
465 470

Ile Ser Glu Thr Phe Leu Thr Pro Ile Asn Gly Phe Gly Leu Gln Ala
485

Asp Glu Asn Ser Arg Leu Ile Thr Leu Thr Cys Lys Ser Tyr Leu Arg
500 505

Glu Leu Leu Leu Ala Thr Asp Leu Ser Asn Lys Glu Thr Lys Leu Ile
515 520

Val Pro Pro Ser Gly Phe Ile Ser Asn Ile Val Glu Asn Gly Ser Ile
530 535

Glu Glu Asp Asn Leu Glu Pro Trp Lys Ala Asn Asn Lys Asn Ala Tyr
545 550
Val Asp His Thr Gly Gly Val Asn Gly Thr Lys Ala Leu Tyr Val His
565 570

Lys Asp Gly Gly Ile Ser Gln Phe Ile Gly Asp Lys Leu Lys Pro Lys
580 585

Thr Glu Tyr Val Ile Gln Tyr Thr Val Lys Gly Lys Pro Ser Ile His
595 600

Leu Lys Asp Glu Asn Thr Gly Tyr Ile His Tyr Glu Asp Thr Asn Asn
610 615

Asn Leu Glu Asp Tyr Gln Thr Ile Asn Lys Arg Phe Thr Thr Gly Thr
625 630 635

Asp Leu Lys Gly Val Tyr Leu Ile Leu Lys Ser Gln Asn Gly Asp Glu
645 650

Ala Trp Gly Asp Asn Phe Ile Ile Leu Glu Ile Ser Pro Ser Glu Lys
660 665

Leu Leu Ser Pro Glu Leu Ile Asn Thr Asn Asn Trp Thr Ser Thr Gly
675 680

Ser Thr Asn Ile Ser Gly Asn Thr Leu Thr Leu Tyr Gln Gly Gly Arg
690

Gly Ile Leu Lys Gln Asn Leu Gln Leu Asp Ser Phe Ser Thr Tyr Arg
705 710

Val Tyr Phe Ser Val Ser Gly Asp Ala Asn Val Arg Ile Arg Asn Ser
725 730

Arg Glu Val Leu Phe Glu Lys Arg Tyr Met Ser Gly Ala Lys Asp Val
740 745

Ser Glu Met Phe Thr Thr Lys Phe Glu Lys Asp Asn Phe Tyr Ile Glu
755 760

Leu Ser Gln Gly Asn Asn Leu Tyr Gly Gly Pro Ile Val His Phe Tyr
770 775

Asp Val Ser Ile Lys Pro
785 790



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GCTCTAGAAG GAGGTAACTT ATGAACAAGA ATAATACTAA ATTAAGC 47

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2035 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 158C2-ptl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ATGAACAAGA ATAATACTAA ATTAAGCGCA AGGGCCTACC GAGTTTTATT GATTATTTTA 60
ATGGCATTTA TGGATTTGCC ACTGGTATCA AAGACATTAT GAATATGATT TTTAAAACGG 120
ATACAGGTGG TAATCTAACC TTAGACGAAA TCCTAAAGAA TCAGCAGTTA CTAAATGAGA 180
TTTCTGGTAA ATTGGATGGG GTAAATGGGA GCTTAAATGA TCTTATCGCA CAGGGAAACT 240
TAAATACAGA ATTAGCTAAG CAAATCTTAA AAGTTGCAAA TGAACAAAAT CAAGTTTTAA 300
ATGATGTTAA TAACAAACTA GACTGCGATA AATACGATGC TTAAAATATA TCTACCTAAA 360
ATTCACATCT ATGTTAAGTG ATGTACTGAA GCCAAAATTA TGTGCTTAAG TCTTGCAAAT 420
TGGAATTACC TTTAAGTAAC ATCTGCACCT TGGCAAGAAA TCTCCGACAA GCTAGATATT 480
ATTAACGTAA ATGTGCTTAT TAACTCTACG CTTACTGAAA TTACACCTGC GTATCAACGA 540
ATTAAATATG TGAATGAAAA ATTTGACGAT TTAACTTTTG CTACAGAAAA CACTTTAAAA 600
GTAAAAAAGG ATAGCTCTCC TGCTGATATT CTTGACGAGT TAACTGAATT AACTGAACTA 660
GCGAAAAGTG TTACAAAAAA TGACGTGGAT GGTTTTGAAT TTTACCTTAA TACATTCCAT 720
GATGTAATGG TGGGAAATAA TTTATTCGGT CGTTCAGCTT TAAAAACTGC TTCGGAATTA 780
ATTGCTAAAG AAAATGTGAA AACAAGTGGC AGTGAAGTAG GAAATGTTTA TAATTTCTTA 840
ATTGTATTAA CAGCTCTACA AGCAAAAGCT TTTCTTACTT TAACAACATG CCGAAAATTA 900
TTAGGCTTAG CAGATATTGA TTATACTTCT ATCATGAATG AGCATTTAAA TAAGGAAAAA 960
GAGGAATTTA GAGTAAACAT CCTTCCCACA CTTTCTAATA CCTTTTCTAA TCCTAATTAT 1020
GCAAAAGCTA AGGGAAGTAA TGAAGATACA AAGATGATTG TGGAAGCTAA ACCAGGATAT 1080
GTTTTGGTTG GATTTGAAAT GAGCAATAAT TCAATTACAG TATTAAAAGC ATATCAAGCT 1140

AAGCTAAAAA AAGATTATCA AATTGATAAG GATTCGTTAT CAGAAATAAT ATATAGTACG 1200 

TGATACGGAT AAATTATTAT GTCCGGATCA ATCTGAACAA TATATTATAC AAAGAACATA 1260 

GCATTTCCAA ATGAATATGT TATTACTAAA ATTGCTTTTA CTAAAAAAAT GAACAGTTTA 1320

AGGTATGAGG CGACAGCGAA TTTTTATGAT TCTTCTACAG GGGATATTGA TCTAAATAAG 1380

ACAAAAGTAG AATCAAGTGA AGCGGAGTAT AGTATGCTAA AAGCTAGTGA TGATGAAGTT 1440 

TACATGCCGC TAGGTCTTAT CAGTGAAACA TTTTTAAATC CAATTAATGG ATTTAGGCTT 1500 

GCAGTCGATG AAAATTCCAG ACTAGTAACT TTAACATGTA GATCATATTT AAGAGAGACA 1560 

TTGTTAGCGA CAGATTTAAA TAATAAAGAA ACTAAATTGA TTGTCCCACC TAATGTTTTT 1620 

ATTAGCAATA TTGTAGAGAA TGGAAATATA GAAATGGACA CCTTAGAACC ATGGAAGGCA 1680 

AATAATGAGA ATGCGAATGT AGATTATTCA GGCGGAGTGA ATGGAACTAG AGCTTTATAT 1740

GTTCATAAGG ATGGTGAATT CTCACATTTT ATTGGAGACA AGTTGAAATC TAAAACAGAA 1800 

TACTTGATTC GATATATTGT AAAAGGAAAA GCTTCTATTT TTTTAAAAGA TGAAAGAAAT 1860 

GAAAATTACA TTTACGAGGA TACAAATAAT AATTTAGAAG ATTATCAAAC TATTACTAAA 1920 

CGTTTTACTA CAGGAACTGA TTCGACAGGA TTTTATTTAT TTTTTACTAC TCAAGATGGA 1980 

AATGAAGCTT GGGGAGACAC TTTTTTTCTC TAGAAAGAGG TAACTTATGA ACAAG 2035 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
CATCCTCCCT ACACTTTCTA A 21

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 950 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 49C3-ptl 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

AAACTAGAGG GAGTGATAAG GATGCGAAAA TCATTATGGA AGCTAAACCT GGATATGCTT 60 

TAGTTGGATT TGAAATAAGT AAGGATTCAA TTGCAGTATT AAAAGTTTAT CAGGCAAAGC 120 

TAAAACACAA CTATCAAATT GATAAGGATT CGTTATCAGA AATTGTTTAT GGTGATATAG 180

ATAAATTATT ATGTCCGGAT CAATCTGAAC AAATGTATTA TACAAATAAA ATAGCATTTC 240

CAAATGAATA TGTTATCACT AAAATTGCTT TTACTAAAAA ACTGAACAGT TTAAGATATG 300 

AGGTCACAGC GAATTTTTAT GACTCTTCTA CAGGAGATAT TGATCTAAAT AAGAAAAAAA 360 

TAGAATCAAG TGAAGCGGAG TTTAGTATGC TAAATGCTAA TAATGATGGT GTTTATATGC 420 

CGATAGGTAC TATAAGTGAA ACATTTTTGA CTCCAATTAA TGGATTTGGC CTCGTAGTCG 480

ATGAAAATTC AAGACTAGTA ACTTTGACAT GTAAATCATA TTTAAGAGAG ACATTGTTAG 540 

CAACAGACTT AAGTAATAAA GAAACTAAAC TGATTGTCCC ACCTAATGGT TTTATTAGCA 600 

ATATTGTAGA AAATGGGAAC TTAGAGGGAG AAAACTTAGA GCCGTGGGAA AGCAAATAAC 660 

AAAAATGCGT ATGTAGATCA TACCGGAGGT GTAAATGGAA CTAAAGTTTT ATATGTTCAT 720

GAGGATGGTG AGTTCTCACA ATTTATTGGG GATAAATTGA AATTGAAAAC AGAATATGTA 780

ATTCCATATA TTGTAAAGGG GAAAGCTGCT ATTTATTTAA AAGATGAAAA AAATGGGGAT 840

TACATATCAT GAAGAAACAT CATAATGCAA TTGAAGATTT TTCCAGCTGT AACTTCAATA 900 

ATGATTTTCG CATCCTTATC ATCCCTCTAG CTTTTTCATA ATAGGATAGA 950 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
AAATTATGCG CTAAGTCTGC 20 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TTGATCCGGA CATAATAAT



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 176 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 49C6-ptl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GTAAATTATG CGCTAAGTCT GCACCTTTTT TCACTGTTAC TAAACATCAC TTTTCCTATA 60 

TCCCCTTAGC TCTTATGGAT TATTGAGCAA ACTTATCTTG TTAATTACTA CTCCCCATCA 120 

TATGCTAAAC AAAAACCAAA CAAACATTAT CTATTATATG TCCGGATCAA AATGTA 176 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GGRTTAMTTG GRTAYTATTT 



(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ATATCKWAYA TTKGCATTTA 20 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1076 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 10E1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TGGGATTACT TGGATATTAT TTCCAGGATC AAAAGTTTCA GCAACTTGCT TTGATGGCAC 60 

ATAGACAAGC TTCTGATTTG GAAATCCCGA AAGATGACGT GAAACAGTTA CTATCCAAGG 120 

AGCAGCAACA CATTCAATCT GTTAGATGGC TTGGCTATAT TCAGCCACCT CAAACAGGAG 180 

ACTATGTATT GTCAACCTCA TCCGACCAAC AGGTCGTGAT TGAACTCGAT GGAAAAACCA 240

TTGTCAATCA AACTTCTATG ACAGAACCGA TTCAACTCGA AAAAGATAAG CTCTATAAAA 300

TTAGAATTGA ATATGTCCCA GAAGATACAA AAGAACAAGA GAACCTCCTT GACTTTCAGC 360

TCAACTGGTC GATTTCAGGA TCAGAGATAG AACCAATTCC GGAGAATGCT TTCCATTTAC 420

CAAATTTTTC TCGTAAACAA GATCAAGAGA AAATCATCCC TGAAACCAGT TTGTTTCAGG 480

AACAAGGAGA TGAGAAAAAA GTATCTCGCA GTAAGAGATC TTTAGCTACA AATCCTATCC 540

GTGATACAGA TGATGATAGT ATTTATGATG AATGGGAAAC GGAAGGATAC ACGATACGGG 600 

AACAAATAGC AGTGAAATGG GACGATTCTA TGAAGGATAG AGGTTATACC AAATATGTGT 660 

CAAACCCCTA TAAGTCTCAT ACAGTAGGAG ATCCATACAC AGATTGGGAA AAAGCGGCTG 720 

GCCGTATCGA TAACGGTGTC AAAGCAGAAG CCAGAAATCC TTTAGTCGCG GCCTATCCAA 780 

CTGTTGGTGT ACATATGGAA AGATTAATTG TCTCCGAAAA ACAAAATATA TCAACAGGGC 840

TTGGAAAAAC TGTATCTGCG TCTATGTCCG CAAGCAATAC CGCAGCGATT ACGGCAGGTA 900

TTGATGCAAC AGCCGGTGCC TCTTTACTCG GGCCATCTGG AAGTGTCACG GCTCATTTTT 960

CTTATACAGG ATCTAGTACA TCCACCGTTG AAGATAGCTC CAGCCGGAAT TGGAGTCAAG 1020

ACCTTGGGAT CGATACGGGA CAATCTGCAT ATTTAAATGC CAAATGTACG ATATAA 1076 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 10E1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Gly Leu Leu Gly Tyr Tyr Phe Gln Asp Gln Lys Phe Gln Gln Leu Ala
1 5 10 15

Leu Met Ala His Arg Gln Ala Ser Asp Leu Glu Ile Pro Lys Asp Asp
20 25 30

Val Lys Gln Leu Leu Ser Lys Glu Gln Gln His Ile Gln Ser Val Arg
35 40 45

Trp Leu Gly Tyr Ile Gln Pro Pro Gln Thr Gly Asp Tyr Val Leu Ser
50 55 60

Thr Ser Ser Asp Gln Gln Val Val Ile Glu Leu Asp Gly Lys Thr Ile
65 70 75 80

Val Asn Gln Thr Ser Met Thr Glu Pro Ile Gln Leu Glu Lys Asp Lys
85 90 95

Leu Tyr Lys Ile Arg Ile Glu Tyr Val Pro Glu Asp Thr Lys Glu Gln
100 105 110

Glu Asn Leu Leu Asp Phe Gln Leu Asn Trp Ser Ile Ser Gly Ser Glu
115 120 125

Ile Glu Pro Ile Pro Glu Asn Ala Phe His Leu Pro Asn Phe Ser Arg
130 135 140

Lys Gln Asp Gln Glu Lys Ile Ile Pro Glu Thr Ser Leu Phe Gln Glu
145 150 155 160

Gln Gly Asp Glu Lys Lys Val Ser Arg Ser Lys Arg Ser Leu Ala Thr
165 170 175

Asn Pro Ile Arg Asp Thr Asp Asp Asp Ser Ile Tyr Asp Glu Trp Glu
180 185 190

Thr Glu Gly Tyr Thr Ile Arg Glu Gln Ile Ala Val Lys Trp Asp Asp
195 200 205

Ser Met Lys Asp Arg Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Lys
210 215 220

Ser His Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gly
225 230 235 240

Arg Ile Asp Asn Gly Val Lys Ala Glu Ala Arg Asn Pro Leu Val Ala
245 250 255

Ala Tyr Pro Thr Val Gly Val His Met Glu Arg Leu Ile Val Ser Glu
260 265 270

Lys Gln Asn Ile Ser Thr Gly Leu Gly Lys Thr Val Ser Ala Ser Met
275 280 285

Ser Ala Ser Asn Thr Ala Ala Ile Thr Ala Gly Ile Asp Ala Thr Ala
290 295 300

Gly Ala Ser Leu Leu Gly Pro Ser Gly Ser Val Thr Ala His Phe Ser
305 310 315 320

Tyr Thr Gly Ser Ser Thr Ser Thr Val Glu Asp Ser Ser Ser Arg Asn
325 330 335

Trp Ser Gln Asp Leu Gly Ile Asp Thr Gly Gln Ser Ala Tyr Leu Asn
340 345 350

Ala Lys Cys Thr Ile
355



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1045 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 31J2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TGGGTTACTT GGGTATTATT TTAAAGGAAA AGATTTTAAT AATCTTACTA TATTTGCTCC 60

AACACGTGAG AATACTCTTA TTTATGATTT AGAAACAGCG AATTCTTTAT TAGATAAGCA 120

ACAACAAACC TATCAATCTA TTCGTTGGAT CGGTTTAATA AAAAGCAAAA AAGCTGGAGA 180

TTTTACCTTT CAATTATCGG ATGATGAGCA TGCTATTATA GAAATCGATG GGAAAGTTAT 240

TTCGCAAAAA GGCCAAAAGA AACAAGTTGT TCATTTAGAA AAAGATAAAT TAGTTCCCAT 300

CAAAATTGAA TATCAATCTG ATAAAGCGTT AAACCCAGAT AGTCAAATGT TTAAAGAATT 360

GAAATTATTT AAAATAAATA GTCAAAAACA ATCTCAGCAA GTGCAACAAG ACGAATTGAG 420

AAATCCTGAA TTTGGTAAAG AAAAAACTCA AACATATTTA AAGAAAGCAT CGAAAAGCAG 480

CTTGTTTAGC AATAAAAGTA AACGAGATAT AGATGAAGAT ATAGATGAGG ATACAGATAC 540

AGATGGAGAT GCCATTCCTG ATGTATGGGA AGAAAATGGG TATACCATCA AAGGAAGAGT 600

AGCTGTTAAA TGGGACGAAG GATTAGCTGA TAAGGGATAT AAAAAGTTTG TTTCCAATCC 660

TTTTAGACAG CACACTGCTG GTGACCCCTA TAGTGACTAT GAAAAGGCAT CAAAAGATTT 720

GGATTTATCT AATGCAAAAG AAACATTTAA TCCATTGGTG GCTGCTTTTC CAAGTGTCAA 780

TGTTAGCTTG GAAAATGTCA CCATATCAAA AGATGAAAAT AAAACTGCTG AAATTGCGTC 840

TACTTCATCG AATAATTGGT CCTATACAAA TACAGAGGGG GCATCTATTG AAGCTGGAAT 900

TGGACCAGAA GGTTTGTTGT CTTTTGGAGT AAGTGCCAAT TATCAACATT CTGAAACAGT 960

GGCCAAAGAG TGGGGTACAA CTAAGGGAGA CGCAACACAA TATAATACAG CTTCAGCAGG 1020

ATATCTAAAT GCCAATGTAC GATAT 1045

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 348 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 31J2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gly Leu Leu Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr
1 5 10 15

Ile Phe Ala Pro Thr Arg Glu Asn Thr Leu Ile Tyr Asp Leu Glu Thr
20 25 30

Ala Asn Ser Leu Leu Asp Lys Gln Gln Gln Thr Tyr Gln Ser Ile Arg
35 40 45

Trp Ile Gly Leu Ile Lys Ser Lys Lys Ala Gly Asp Phe Thr Phe Gln
50 55 60

Leu Ser Asp Asp Glu His Ala Ile Ile Glu Ile Asp Gly Lys Val Ile
65 70 75 80

Ser Gln Lys Gly Gln Lys Lys Gln Val Val His Leu Glu Lys Asp Lys
85 90 95

Leu Val Pro Ile Lys Ile Glu Tyr Gln Ser Asp Lys Ala Leu Asn Pro
100 105 110

Asp Ser Gln Met Phe Lys Glu Leu Lys Leu Phe Lys Ile Asn Ser Gln
115 120 125

Lys Gln Ser Gln Gln Val Gln Gln Asp Glu Leu Arg Asn Pro Glu Phe
130 135 140

Gly Lys Glu Lys Thr Gln Thr Tyr Leu Lys Lys Ala Ser Lys Ser Ser
145 150 155 160

Leu Phe Ser Asn Lys Ser Lys Arg Asp Ile Asp Glu Asp Ile Asp Glu
165 170 175

Asp Thr Asp Thr Asp Gly Asp Ala Ile Pro Asp Val Trp Glu Glu Asn
180 185 190

Gly Tyr Thr Ile Lys Gly Arg Val Ala Val Lys Trp Asp Glu Gly Leu
195 200 205

Ala Asp Lys Gly Tyr Lys Lys Phe Val Ser Asn Pro Phe Arg Gln His
210 215 220

Thr Ala Gly Asp Pro Tyr Ser Asp Tyr Glu Lys Ala Ser Lys Asp Leu
225 230 235 240

Asp Leu Ser Asn Ala Lys Glu Thr Phe Asn Pro Leu Val Ala Ala Phe
245 250 255

Pro Ser Val Asn Val Ser Leu Glu Asn Val Thr Ile Ser Lys Asp Glu
260 265 270

Asn Lys Thr Ala Glu Ile Ala Ser Thr Ser Ser Asn Asn Trp Ser Tyr
275 280 285

Thr Asn Thr Glu Gly Ala Ser Ile Glu Ala Gly Ile Gly Pro Glu Gly
290 295 300

Leu Leu Ser Phe Gly Val Ser Ala Asn Tyr Gln His Ser Glu Thr Val
305 310 315 320

Ala Lys Glu Trp Gly Thr Thr Lys Gly Asp Ala Thr Gln Tyr Asn Thr
325 330 335

Ala Ser Ala Gly Tyr Leu Asn Ala Asn Val Arg Tyr
340 345



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1641 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 33D2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

CCAAAGGGGG NTTAAACCNG GANGGTTNNN TNNTTNNTTN TNGAANCCCA NTTGGAAACC 60 

CNATNAAATT CNTGGTTANT GGTNGTGAGT GNNTNTTTTA NCNGAGNTTG CCCNTTTGNN 120 

TACCNGGATT TNAAGGCAGA ANTTNTTNNT NGCTNNTTAA AGGTTNTGNT TNTNANTGAA 180 

TTTTTTNGGN TTTGCCCAAA AAACAAGGAT GAATCCTGTT ATTCCNCCCT NGAAAAAATN 240

GAAACGGAAC AACGTGAGTA TGATAAACAT CTTTTACAAA CTGCGACATC TTGTTGAAAA 300

TGCCTTTTTT GAAAANNTAA AAGGTTTCGT GGCATTGCCA CACGTTATAC AAAAACCACG 360

TCTGCTTTTA GAGGGGCTGT TACCTTGGCT GCTATTTCTC TGTGGTTGAA TCTCGTATAG 420

ACACTATCTA GTCTATACAT CTTATCTTTT CATCATGATT CCAGTCGTAC ATTTACTCAA 480

AAATAGAAAG GATGACCCCT ATGCAATTAA AAAATGTATA CAAATGTTTA ACCATTACAG 540 

CGCTTTTGGC TCAAATCGCC GCCTTCCCGT CTTCCTCTTT TGCGGAAGAC GGGAAGAAAA 600 

AAGAAGAAAA TACAGCTAAA ACAGAACATC AACAGAAAAA AGAAACAAAA CCAGTTGTGG 660 

GATTAATTGG TCACTATTTT ACTGATGATC AGTTTACTAA CACAGCATTT ATTCAAGTAG 720 

GAGAAAAAAG TAAATTACTA GATTCAAAAA TAGTAAAGCA AGATATGTCC AATTTGAAAT 780 

CCATTCGATG GGAAGGAAAT GTGAAACCTC CTGAAACAGG AGAATATCTA CTTTCCACGT 840 

CCTCTAATGA AAATGTTACA GTAAAAGTAG ATGGAGAAAC TGTTATTAAC AAAGCTAACA 900 

TGGAAAAAGC AATGAAACTC GAAAAAGATA AACCACACTC TATTGAAATT GAATATCATG 960

TTCCTGAGAA CGGGAAGGAA CTACAATTAT TTTGGCAAAT AAATGACCAG AAAGCTGTTA 1020

AAATCCCAGA AAAAAACATA CTATCACCAA ATCTTTCTGA ACAGATACAA CCGCAACAGC 1080

GTTCAACTCA ATCTCAACAA AATCAAAATG ATAGGGATGG GGATAAAATC CCTGATAGTT 1140

TAGAAGAAAA TGGCTATACA TTTAAAGACG GTGCGATTGT TGCCTGGAAC GATTCCTATG 1200

CAGCACTAGG CTATAAAAAA TACATATCCA ATTCTAATAA GGCTAAAACA GCTGCTGACC 1260

CCTATACGGA CTTTGAAAAA GTAACAGGAC ACATGCCGGA GGCAACTAAA GATGAAGTAA 1320

AAGATCCACT AGTAGCCGCT TATCCCTCGG TAGGTGTTGC TATGGAAAAA TTTCATTTTT 1380

CTAGAAATGA AACGGTCACT GAAGGAGACT CAGGTACTGT TTCAAAAACC GTAACCAATA 1440

CAAGCACAAC AACAAATAGC ATCGATGTTG GGGGATCCAT TGGATGGGGA GAAAAAGGAT 1500

TTTCTTTTTC ATTCTCTCCC AAATATACGC ATTCTTGGAG TAATAGTACC GCTGTTGCTG 1560

ATACTGAAAG TAGCACATGG TCTTCACAAT TAGCGTATAA TCCTTCAGAA CGTGCTTTCT 1620

TAAATGCCAA TATACGATAT A 1641

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 33D2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Gly Leu Ile Gly His Tyr Phe Thr Asp Asp Gln Phe Thr Asn Thr Ala
1 5 10 15

Phe Ile Gln Val Gly Glu Lys Ser Lys Leu Leu Asp Ser Lys Ile Val
20 25 30

Lys Gln Asp Met Ser Asn Leu Lys Ser Ile Arg Trp Glu Gly Asn Val
35 40 45

Lys Pro Pro Glu Thr Gly Glu Tyr Leu Leu Ser Thr Ser Ser Asn Glu
50 55 60

Asn Val Thr Val Lys Val Asp Gly Glu Thr Val Ile Asn Lys Ala Asn
65 70

Met Glu Lys Ala Met Lys Leu Glu Lys Asp Lys Pro His Ser Ile Glu
85 90

Ile Glu Tyr His Val Pro Glu Asn Gly Lys Glu Leu Gln Leu Phe Trp
100 105

Gln Ile Asn Asp Gln Lys Ala Val Lys Ile Pro Glu Lys Asn Ile Leu
115 120

Ser Pro Asn Leu Ser Glu Gln Ile Gln Pro Gln Gln Arg Ser Thr Gln
130 135

Ser Gln Gln Asn Gln Asn Asp Arg Asp Gly Asp Lys Ile Pro Asp Ser
145 150 155

Leu Glu Glu Asn Gly Tyr Thr Phe Lys Asp Gly Ala Ile Val Ala Trp
165 170

Asn Asp Ser Tyr Ala Ala Leu Gly Tyr Lys Lys Tyr Ile Ser Asn Ser
180 185

Asn Lys Ala Lys Thr Ala Ala Asp Pro Tyr Thr Asp Phe Glu Lys Val
195 200

Thr Gly His Met Pro Glu Ala Thr Lys Asp Glu Val Lys Asp Pro Leu
210 215

Val Ala Ala Tyr Pro Ser Val Gly Val Ala Met Glu Lys Phe His Phe
225 230

Ser Arg Asn Glu Thr Val Thr Glu Gly Asp Ser Gly Thr Val Ser Lys
245 250

Thr Val Thr Asn Thr Ser Thr Thr Thr Asn Ser Ile Asp Val Gly Gly
260

Ser Ile Gly Trp Gly Glu Lys Gly Phe Ser Phe Ser Phe Ser Pro Lys
275 280

Tyr Thr His Ser Trp Ser Asn Ser Thr Ala Val Ala Asp Thr Glu Ser
290 295

Ser Thr Trp Ser Ser Gln Leu Ala Tyr Asn Pro Ser Glu Arg Ala Phe
305 310

Leu Asn Ala Asn Ile Arg Tyr
325



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1042 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 66D3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

TTAATTGGGT ACTATTTTAA AGGAAAAGAT TTTAATAATC TTACTATATT TGCTCCAACA 60
CGTGAGAATA CTCTTATTTA TGATTTAGAA ACAGCGAATT CTTTATTAGA TAAGCAACAA 120
CAAACCTATC AATCTATTCG TTGGATCGGT TTAATAAAAA GCAAAAAAGC TGGAGATTTT 180
ACCTTTCAAT TATCGGATGA TGAGCATGCT ATTATAGAAA TCGATGGGAA AGTTATTTCG 240
CAAAAAGGCC AAAAGAAACA AGTTGTTCAT TTAGAAAAAG ATAAATTAGT TCCCATCAAA 300
ATTGAATATC AATCTGATAA AGCGTTAAAC CCAGATAGTC AAATGTTTAA AGAATTGAAA 360
TTATTTAAAA TAAATAGTCA AAAACAATCT CAGCAAGTGC AACAAGACGA ATTGAGAAAT 420
CCTGAATTTG GTAAAGAAAA AACTCAAACA TATTTAAAGA AAGCATCGAA AAGCAGCCTG 480
TTTAGCAATA AAAGTAAACG AGATATAGAT GAAGATATAG ATGAGGATAC AGATACAGAT 540
GGAGATGCCA TTCCTGATGT ATGGGAAGAA AATGGGTATA CCATCAAAGG AAGAGTAGCT 600
GTTAAATGGG ACGAAGGATT AGCTGATAAG GGATATAAAA AGTTTGTTTC CAATCCTTTT 660
AGACAGCACA CTGCTGGTGA CCCCTATAGT GACTATGAAA AGGCATCAAA AGATTTGGAT 720
TTATCTAATG CAAAAGAAAC ATTTAATCCA TTGGTGGCTG CTTTTCCAAG TGTCAATGTT 780
AGCTTGGAAA ATGTCACCAT ATCAAAAGAT GAAAATAAAA CTGCTGAAAT TGCGTCTACT 840
TCATCGAATA ATTGGTCCTA TACAAATACA GAGGGGGCAT CTATTGAAGC TGGAATTGGA 900
CCAGAAGGTT TGTTGTCTTT TGGAGTAAGT GCCAATTATC AACATTCTGA AACAGTGGCC 960
AAAGAGTGGG GTACAACTAA GGGAGACGCA ACACAATATA ATACAGCTTC AGCAGGATAT 1020
CTAAATGCCA ATGTACGATA TA 1042

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 347 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 66D3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Leu Ile Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr Ile
1 5 10

Phe Ala Pro Thr Arg Glu Asn Thr Leu Ile Tyr Asp Leu Glu Thr Ala
20 25

Asn Ser Leu Leu Asp Lys Gln Gln Gln Thr Tyr Gln Ser Ile Arg Trp
35 40

Ile Gly Leu Ile Lys Ser Lys Lys Ala Gly Asp Phe Thr Phe Gln Leu
50 55

Ser Asp Asp Glu His Ala Ile Ile Glu Ile Asp Gly Lys Val Ile Ser
65 70

Gln Lys Gly Gln Lys Lys Gln Val Val His Leu Glu Lys Asp Lys Leu
85 90

Val Pro Ile Lys Ile Glu Tyr Gln Ser Asp Lys Ala Leu Asn Pro Asp
100 105

Ser Gln Met Phe Lys Glu Leu Lys Leu Phe Lys Ile Asn Ser Gln Lys
115 120

Gln Ser Gln Gln Val Gln Gln Asp Glu Leu Arg Asn Pro Glu Phe Gly
130 135

Lys Glu Lys Thr Gln Thr Tyr Leu Lys Lys Ala Ser Lys Ser Ser Leu
145 150

Phe Ser Asn Lys Ser Lys Arg Asp Ile Asp Glu Asp Ile Asp Glu Asp
165 170

Thr Asp Thr Asp Gly Asp Ala Ile Pro Asp Val Trp Glu Glu Asn Gly
180 185

Tyr Thr Ile Lys Gly Arg Val Ala Val Lys Trp Asp Glu Gly Leu Ala
195 200 205

Asp Lys Gly Tyr Lys Lys Phe Val Ser Asn Pro Phe Arg Gln His Thr
210 215

Ala Gly Asp Pro Tyr Ser Asp Tyr Glu Lys Ala Ser Lys Asp Leu Asp
225 230 235

Leu Ser Asn Ala Lys Glu Thr Phe Asn Pro Leu Val Ala Ala Phe Pro
245 250 255

Ser Val Asn Val Ser Leu Glu Asn Val Thr Ile Ser Lys Asp Glu Asn
260 265 270

Lys Thr Ala Glu Ile Ala Ser Thr Ser Ser Asn Asn Trp Ser Tyr Thr
275 280 285

Asn Thr Glu Gly Ala Ser Ile Glu Ala Gly Ile Gly Pro Glu Gly Leu
290 295 300

Leu Ser Phe Gly Val Ser Ala Asn Tyr Gln His Ser Glu Thr Val Ala
305 310 315 320

Lys Glu Trp Gly Thr Thr Lys Gly Asp Ala Thr Gln Tyr Asn Thr Ala
325 330 335

Ser Ala Gly Tyr Leu Asn Ala Asn Val Arg Tyr
340 345



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1278 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 68F

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

TGGATTACTT GGGTACTATT TTAAAGGGAA AGATTTTAAT GATCTTACTG TATTTGCACC 60 

AACGCGTGGG AATACTCTTG TATATGATCA ACAAACAGCA AATACATTAC TAAATCAAAA 120 

ACAACAAGAC TTTCAGTCTA TTCGTTGGGT TGGTTTAATT CAAAGTAAAG AAGCAGGCGA 180 

TTTTACATTT AACTTATCAG ATGATGAACA TACGATGATA GAAATCGATG GGAAAGTTAT 240 

TTCTAATAAA GGGAAAGAAA AACAAGTTGT CCATTTAGAA AAAGGACAGT TCGTTTCTAT 300 

CAAAATAGAA TATCAAGCTG ATGAACCATT TAATGCGGAT AGTCAAACCT TTAAAAATTT 360 

GAAACTCTTT AAAGTAGATA CTAAGCAACA GTCCCAGCAA ATTCAACTAG ATGAATTAAG 420 

AAACCCTGAA TTTAATAAAA AAGAAACACA AGAATTTCTA ACAAAAGCAA CAAAAACAAA 480 

CCTTATTACT CAAAAAGTGA AGAGTACTAG GGATGAAGAC ACGGATACAG ATGGAGATTC 540

TATTCCAGAC ATTTGGGAAG AAAATGGGTA TACCATCCAA AATAAGATTG CCGTCAAATG 600

GGATGATTCA TTAGCAAGTA AAGGATATAC GAAATTTGTT TCAAACCCAC TAGATACTCA 660

CACGGTTGGA GATCCTTATA CAGATTATGA AAAAGCAGCA AGGGATTTAG ATTTGTCAAA 720

TGCAAAAGAA ACATTTAACC CATTAGTTGC GGCTTTTCCA AGTGTGAATG TGAGTATGGA 780

AAAAGTGATA TTGTCTCCAG ATGAGAACTT ATCAAATAGT ATCGAGTCTC ATTCATCTAC 840

GAATTGGTCG TATACGAATA CAGAAGGGGC TTCTATTGAA GCTGGTGGGG GAGCATTAGG 900

CCTATCTTTT GGTGTAAGTG CAAACTATCA ACATTCTGAA ACAGTTGGGT ATGAATGGGG 960

AACATCTACG GGAAATACTT CGCAATTTAA TACAGCTTCA GCGGGGTATT TAAATGCGAA 1020

TGTTCGCTAC AATAACGTGG GAACGGGTGC AATCTATGAT GTAAAGCCAA CAACGAGTTT 1080

TGTATTAAAT AAAGATACCA TCGCAACGAT AACAGCAAAA TCGAATACGA CTGCATTAAG 1140

TATCTCACCA GGACAAAGTT ATCCGAAACA AGGTCAAAAT GGAATCGCGA TCACATCGAT 1200

GGATGATTTT AACTCACATC CGATTACATT GAATAAGCAA CAGGTAGGTC AACTGTTAAA 1260

TAATACCCAA TTAATCCA 1278

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 425 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 68F 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Gly Leu Leu Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asp Leu Thr
1 5 10 15

Val Phe Ala Pro Thr Arg Gly Asn Thr Leu Val Tyr Asp Gln Gln Thr
20 25 30

Ala Asn Thr Leu Leu Asn Gln Lys Gln Gln Asp Phe Gln Ser Ile Arg
35 40 45

Trp Val Gly Leu Ile Gln Ser Lys Glu Ala Gly Asp Phe Thr Phe Asn
50 55 60

Leu Ser Asp Asp Glu His Thr Met Ile Glu Ile Asp Gly Lys Val Ile
65 70 75 80

Ser Asn Lys Gly Lys Glu Lys Gln Val Val His Leu Glu Lys Gly Gln
85 90 95

Phe Val Ser Ile Lys Ile Glu Tyr Gln Ala Asp Glu Pro Phe Asn Ala
100 105 110

Asp Ser Gln Thr Phe Lys Asn Leu Lys Leu Phe Lys Val Asp Thr Lys
115 120 125

Gln Gln Ser Gln Gln Ile Gln Leu Asp Glu Leu Arg Asn Pro Glu Phe
130 135 140

Asn Lys Lys Glu Thr Gln Glu Phe Leu Thr Lys Ala Thr Lys Thr Asn
145 150 155 160

Leu Ile Thr Gln Lys Val Lys Ser Thr Arg Asp Glu Asp Thr Asp Thr
165 170 175

Asp Gly Asp Ser Ile Pro Asp Ile Trp Glu Glu Asn Gly Tyr Thr Ile
180 185 190

Gln Asn Lys Ile Ala Val Lys Trp Asp Asp Ser Leu Ala Ser Lys Gly
195 200 205

Tyr Thr Lys Phe Val Ser Asn Pro Leu Asp Thr His Thr Val Gly Asp
210 215 220

Pro Tyr Thr Asp Tyr Glu Lys Ala Ala Arg Asp Leu Asp Leu Ser Asn
225 230 235 240

Ala Lys Glu Thr Phe Asn Pro Leu Val Ala Ala Phe Pro Ser Val Asn
245 250 255

Val Ser Met Glu Lys Val Ile Leu Ser Pro Asp Glu Asn Leu Ser Asn
260 265 270

Ser Ile Glu Ser His Ser Ser Thr Asn Trp Ser Tyr Thr Asn Thr Glu
275 280 285

Gly Ala Ser Ile Glu Ala Gly Gly Gly Ala Leu Gly Leu Ser Phe Gly
290 295 300

Val Ser Ala Asn Tyr Gln His Ser Glu Thr Val Gly Tyr Glu Trp Gly
305 310 315 320

Thr Ser Thr Gly Asn Thr Ser Gln Phe Asn Thr Ala Ser Ala Gly Tyr
325 330 335

Leu Asn Ala Asn Val Arg Tyr Asn Asn Val Gly Thr Gly Ala Ile Tyr
340 345 350

Asp Val Lys Pro Thr Thr Ser Phe Val Leu Asn Lys Asp Thr Ile Ala
355 360 365

Thr Ile Thr Ala Lys Ser Asn Thr Thr Ala Leu Ser Ile Ser Pro Gly
370 375 380

Gln Ser Tyr Pro Lys Gln Gly Gln Asn Gly Ile Ala Ile Thr Ser Met
385 390 395 400

Asp Asp Phe Asn Ser His Pro Ile Thr Leu Asn Lys Gln Gln Val Gly
405 410 415

Gln Leu Leu Asn Asn Thr Gln Leu Ile
420 425



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 983 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 69AA2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

TGGATTACTT GGGTACTATT TTACTGATGA TCAGTTTACT AACACAGCAT TTATTCAAGT 60

AGGAGAAAAA AGTAAATTAC TAGATTCAAA AATAGTAAAA CAAGATATGT CCAATTTGAA 120

ATCCATTCGA TGGGAAGGAA ATGTGAAACC TCCTGAAACA GGAGAATATC TACTTTCCAC 180

GTCCTCTAAT GAAAATGTTA CAGTAAAAGT AGATGGAGAA ACTGTTATTA ACAAAGCTAA 240

CATGGAAAAA GCAATGAAAC TCGAAAAAGA TAAACCACAC TCTATTGAAA TTGAATATCA 300

TGTTCCTGAG AACGGGAAGG AACTACAATT ATTTTGGCAA ATAAATGACC AGAAAGCTGT 360

TAAAATCCCA GAAAAAAACA TACTATCACC AAATCTTTCT GAACAGATAC AACCGCAACA 420

GCGTTCAACT CAATCTCAAC AAAATCAAAA TGATAGGGAT GGGGATAAAA TCCCTGATAG 480

TTTAGAAGAA AATGGCTATA CATTTAAAGA CGGTGCGATT GTTGCCTGGA ACGATTCCTA 540

TGCAGCACTA GGCTATAAAA AATACATATC CAATTCTAAT AAGGCTAAAA CAGCTGCTGA 600

CCCCTATACG GACTTTGAAA AAGTAACAGG ACACATGCCG GAGGCAACTA AAGATGAAGT 660

AAAAGATCCA CTAGTAGCCG CTTATCCCTC GGTAGGTGTT GCTATGGAAA AATTTCATTT 720



PCT/US97/19804

WO 98/18932

TTCTAGAAAT GAAACGGTCA CTGAAGGAGA CTCAGGTACT GTTTCAAAAA CCGTAACCAA 780

TACAAGCACA ACAACAAATA GCATCGATGT TGGGGGATCC ATTGGATGGG GAGAAAAAGG 840

ATTTTCTTTT TCATTCTCTC CCAAATATAC GCATTCTTGG AGTAATAGTA CCGCTGTTGC 900

TGATACTGAA AGTAGCACAT GGTCTTCACA ATTAGCGTAT AATCCTTCAG AACGTGCTNT 960

CTTAAATGCC AATAKACGAT NTA 983



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 69AA2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Gly Leu Leu Gly Tyr Tyr Phe Thr Asp Asp Gln Phe Thr Asn Thr Ala
1 5 10 15

Phe Ile Gln Val Gly Glu Lys Ser Lys Leu Leu Asp Ser Lys Ile Val
20 25 30

Lys Gln Asp Met Ser Asn Leu Lys Ser Ile Arg Trp Glu Gly Asn Val
35 40 45

Lys Pro Pro Glu Thr Gly Glu Tyr Leu Leu Ser Thr Ser Ser Asn Glu
50 55 60

Asn Val Thr Val Lys Val Asp Gly Glu Thr Val Ile Asn Lys Ala Asn
65 70 75 80

Met Glu Lys Ala Met Lys Leu Glu Lys Asp Lys Pro His Ser Ile Glu
85 90 95

Ile Glu Tyr His Val Pro Glu Asn Gly Lys Glu Leu Gln Leu Phe Trp
100 105 110

Gln Ile Asn Asp Gln Lys Ala Val Lys Ile Pro Glu Lys Asn Ile Leu
115 120 125

Ser Pro Asn Leu Ser Glu Gln Ile Gln Pro Gln Gln Arg Ser Thr Gln
130 135 140

Ser Gln Gln Asn Gln Asn Asp Arg Asp Gly Asp Lys Ile Pro Asp Ser
145 150 155 160

Leu Glu Glu Asn Gly Tyr Thr Phe Lys Asp Gly Ala Ile Val Ala Trp
165 170 175

Asn Asp Ser Tyr Ala Ala Leu Gly Tyr Lys Lys Tyr Ile Ser Asn Ser
180 185 190

Asn Lys Ala Lys Thr Ala Ala Asp Pro Tyr Thr Asp Phe Glu Lys Val
195 200 205

Thr Gly His Met Pro Glu Ala Thr Lys Asp Glu Val Lys Asp Pro Leu
210 215 220

Val Ala Ala Tyr Pro Ser Val Gly Val Ala Met Glu Lys Phe His Phe
225 230 235 240

Ser Arg Asn Glu Thr Val Thr Glu Gly Asp Ser Gly Thr Val Ser Lys
245 250 255

Thr Val Thr Asn Thr Ser Thr Thr Thr Asn Ser Ile Asp Val Gly Gly
260 265 270

Ser Ile Gly Trp Gly Glu Lys Gly Phe Ser Phe Ser Phe Ser Pro Lys
275 280 285

Tyr Thr His Ser Trp Ser Asn Ser Thr Ala Val Ala Asp Thr Glu Ser
290 295 300

Ser Thr Trp Ser Ser Gln Leu Ala Tyr Asn Pro Ser Glu Arg Ala Xaa
305 310 315 320

Leu Asn Ala Asn Xaa Arg Xaa
325



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1075 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 168G1

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 

TGGGTTAATT GGATATTATT TCCAGGATCA AAAATTTCAA CAACTCGCTT TAATGGTACA 60

TAGGCAAGCT TCTGATTTAA AAATACTGAA AGATGACGTG AAACATTTAC TATCCGAAGA 120

TCAACAACAC ATTCAATCAG TAAGGTGGAT AGGCTATATT AAGCCACCTA AAACAGGAGA 180

CTACGTATTG TCAACCTCAT CCGACCAACA GGTCATGATT GAACTAGATG GTAAAGTCAT 240

TCTCAATCAG GCTTCTATGA CAGAACCTGT TCAACTTGAA AAAGATAAAC CGTATAAAAT 300

TAAAATTGAA TATGTTCCGG AACAAACAGA AACACAAGAT ACGCTTCTTG ATTTTAAACT 360

GAACTGGTCT TTTTCAGGCG GAAAAACAGA AACGATTCCA GAAAATGCAT TTCTATTACC 420

AGACCTTTCT CGTAAACAAG ATCAAGAAAA GCTTATTCCT GAGGCAAGTT TATTTCAGAA 480

ACCTGGAGAC GAGAAAAAAA TATCTCGAAG TAAACGGTCC TTTAACTACA GATTCTCTAT 540

ATGATACAAG ATGATGATGG GATTTCGGAT GCGTGGGAAA CAGAAGGATA CACGATACAA 600

AGACAACTGG CAGTGAAATG GGACGATTCT ATGAAGGATC GAGGGTATAC CAAATATGTA 660

TCTAATCCCT ATAATTCCCA TACAGTAGGG GATCCATACA CAGATTGGGA AAAAGCGGCT 720

GGACGTATTG ATAAGGCGAT CAAAGGAGAA GCTAGGAATC CTTTAGTCGC GGCCTATCCA 780

ACCGTTGGTG TACATATGGA AAAACTGATT GTCTCCGAGA AACAAAACAT ATCAACTGGA 840

CTCGGAAAAA CAATATCTGC GTCAATGTCT GCAAGTAATA CCGCAGCGAT TACAGCGGGC 900

ATTGATACGA CGGCTGGTGC TTCTTTACTT GGACCGTCTG GAAGCGTCAC GGCTCATTTT 960

TCTGATACAG GATCCAGTAC ATCCACTGTT GAAAATAGCT CAAGTAATAA TTGGAGTCAA 1020

GATCTTGGAA TCGATACGGG ACAATCTGCA TATTTAAATG CCAATGTACG ATATA 1075

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2645 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 177C8 - vipl

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

ATGAAGAAGA AGTTAGCAAG TGTTGTAACG TGTACGTTAT TAGCTCCTAT GTTTTTGAAT 60 

GGAAATGTGA ATGCTGTTTA CGCAGACAGC AAAACAAATC AAATTTCTAC AACACAGAAA 120 

AATCAACAGA AAGAGATGGA CCGAAAAGGA TTACTTGGGT ATTATTTCAA AGGAAAAGAT 180 

TTTAGTAATC TTACTATGTT TGCACCGACA CGTGATAGTA CTCTTATTTA TGATCAACAA 240 

ACAGCAAATA AACTATTAGA TAAAAAACAA CAAGAATATC AGTCTATTCG TTGGATTGGT 300 




TTGATTCAGA GTAAAGAAAC GGGAGATTTC ACATTTAACT TATCTGAGGA TGAACAGGCA 
ATTATAGAAA TCAATGGGAA AATTATTTCT AATAAAGGGA AAGAAAAGCA AGTTGTCCAT 
TTAGAAAAAG GAAAATTAGT TCCAATCAAA ATAGAGTATC AATCAGATAC AAAATTTAAT 
ATTGACAGTA AAACATTTAA AGAACTTAAA TTATTTAAAA TAGATAGTCA AAACCAACCC 
CAGCAAGTCC AGCAAGATGA ACTGAGAAAT CCTGAATTTA ACAAGAAAGA ATCACAGGAA 
TTCTTAGCGA AACCATCGAA AATAAATCTT TTCACTCAAA AAATGAAAAG GGAAATTGAT
GAAGACACGG ATACGGATGG GGACTCTATT CCTGACCTTT GGGAAGAAAA TGGGTATACG
ATTCAAAATA GAATCGCTGT AAAGTGGGAC GATTCTYTAG CAAGTAAAGG GTATACGAAA 
TTTGTTTCAA ATCCGCTAGA AAGTCACACA GTTGGTGATC CTTATACAGA TTATGAAAAG 
GCAGCAAGAG ACCTAGATTT GTCAAATGCA AAGGAAACGT TTAACCCATT GGTAGCTGCT 
TTTCCAAGTG TGAATGTTAG TATGGAAAAG GTGATATTAT CACCAAATGA AAATTTATCC 
AATAGTGTAG AGTCTCATTC ATCCACGAAT TGGTCTTATA CAAATACAGA AGGTGCTTCT 
GTTGAAGCGG GGATTGGACC AAAAGGTATT TCGTTCGGAG TTAGCGTAAA CTATCAACAC 
TCTGAAACAG TTGCACAAGA ATGGGGAACA TCTACAGGAA ATACTTCGCA ATTCAATACG 
GCTTCAGCGG GATATTTAAA TGCAAATGTT CGATATAACA ATGTAGGAAC TGGTGCCATC 
TACGATGTAA AACCTACAAC AAGTTTTGTA TTAAATAACG ATACTATCGC AACTATTACG 
GCGAAATCTA ATTCTACAGC CTTAAATATA TCTCCTGGAG AAAGTTACCC GAAAAAAGGA 
CAAAATGGAA TCGCAATAAC ATCAATGGAT GATTTTAATT CCCATCCGAT TACATTAAAT 
AAAAAACAAG TAGATAATCT GCTAAATAAT AAACCTATGA TGTTGGAAAC AAACCAAACA 
GATGGTGTTT ATAAGATAAA AGATACACAT GGAAATATAG TAACTGGCGG AGAATGGAAT 
GGTGTCATAC AACAAATCAA GGCTAAAACA GCGTCTATTA TTGTGGATGA TGGGGAACGT 
GTAGCAGAAA AACGTGTAGC GGCAAAAGAT TATGAAAATC CAGAAGATAA AACACCGTCT 
TTAACTTTAA AAGATGCCCT GAAGCTTTCA TATCCAGATG AAATAAAAGA AATAGAGGGA 
TTATTATATT ATAAAAACAA ACCGATATAC GAATCGAGCG TTATGACTTA CTTAGATGAA 
AATACAGCAA AAGAAGTGAC CAAACAATTA AATGATACCA CTGGGAAATT TAAAGATGTA 
AGTCATTTAT ATGATGTAAA ACTGACTCCA AAAATGAATG TTACAATCAA ATTGTCTATA 
CTTTATGATA ATGCTGAGTC TAATGATAAC TCAATTGGTA AATGGACAAA CACAAATATT 
GTTTCAGGTG GAAATAACGG AAAAAAACAA TATTCTTCTA ATAATCCGGA TGCTAATTTG
ACATTAAATA CAGATGCTCA AGAAAAATTA AATAAAAATC GTACTATTAT ATAAGTTTAT 2040

ATATGAAGTC AGAAAAAAAC ACACAATGTG AGATTACTAT AGATGGGGAG ATTTATCCGA 2100 

TCACTACAAA AACAGTGAAT GTGAATAAAG ACAATTACAA AAGATTAGAT ATTATAGCTC 2160 

ATAATATAAA AAGTAATCCA ATTTCTTCAA TTCATATTAA AACGAATGAT GAAATAACTT 2220 

TATTTTGGGA TGATATTTCT ATAACAGATG TAGCATCAAT AAAACCGGAA AATTTAACAG 2280 

ATTCAGAAAT TAAACAGATT TATAGTAGGT ATGGTATTAA GTTAGAAGAT GGAATCCTTA 2340

TTGATAAAAA AGGTGGGATT CATTATGGTG AATTTATTAA TGAAGCTAGT TTTAATATTG 2400

AACCATTGCA AAATTATGTG ACAAAATATA AAGTTACTTA TAGTAGTGAG TTAGGACAAA 2460

ACGTGAGTGA CACACTTGAA AGTGATAAAA TTTACAAGGA TGGGACAATT AAATTTGATT 2520

TTACAAAATA TAGTRAAAAT GAACAAGGAT TATTTTATGA CAGTGGATTA AATTGGGACT 2580

TTAAAATTAA TGCTATTACT TATGATGGTA AAGAGATGAA TGTTTTTCAT AGATATAATA 2640

AATAG 2645



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 881 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 177C8 - vipl 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

Met Lys Lys Lys Leu Ala Ser Val Val Thr Cys Thr Leu Leu Ala Pro
1 5 10 15

Met Phe Leu Asn Gly Asn Val Asn Ala Val Tyr Ala Asp Ser Lys Thr
20 25 30

Asn Gln Ile Ser Thr Thr Gln Lys Asn Gln Gln Lys Glu Met Asp Arg
35 40 45

Lys Gly Leu Leu Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Ser Asn Leu
50 55 60

Thr Met Phe Ala Pro Thr Arg Asp Ser Thr Leu Ile Tyr Asp Gln Gln
65 70 75 80

Thr Ala Asn Lys Leu Leu Asp Lys Lys Gln Gln Glu Tyr Gln Ser Ile
85 90 95

Arg Trp Ile Gly Leu Ile Gln Ser Lys Glu Thr Gly Asp Phe Thr Phe
100 105 110

Asn Leu Ser Glu Asp Glu Gln Ala Ile Ile Glu Ile Asn Gly Lys Ile
115 120 125

Ile Ser Asn Lys Gly Lys Glu Lys Gln Val Val His Leu Glu Lys Gly
130 135 140

Lys Leu Val Pro Ile Lys Ile Glu Tyr Gln Ser Asp Thr Lys Phe Asn
145 150 155 160

Ile Asp Ser Lys Thr Phe Lys Glu Leu Lys Leu Phe Lys Ile Asp Ser
165 170 175

Gln Asn Gln Pro Gln Gln Val Gln Gln Asp Glu Leu Arg Asn Pro Glu
180 185 190

Phe Asn Lys Lys Glu Ser Gln Glu Phe Leu Ala Lys Pro Ser Lys Ile
195 200 205

Asn Leu Phe Thr Gln Lys Met Lys Arg Glu Ile Asp Glu Asp Thr Asp
210 215 220

Thr Asp Gly Asp Ser Ile Pro Asp Leu Trp Glu Glu Asn Gly Tyr Thr
225 230 235 240

Ile Gln Asn Arg Ile Ala Val Lys Trp Asp Asp Ser Leu Ala Ser Lys
245 250 255

Gly Tyr Thr Lys Phe Val Ser Asn Pro Leu Glu Ser His Thr Val Gly
260 265 270

Asp Pro Tyr Thr Asp Tyr Glu Lys Ala Ala Arg Asp Leu Asp Leu Ser
275 280 285

Asn Ala Lys Glu Thr Phe Asn Pro Leu Val Ala Ala Phe Pro Ser Val
290 295 300

Asn Val Ser Met Glu Lys Val Ile Leu Ser Pro Asn Glu Asn Leu Ser
305 310 315 320

Asn Ser Val Glu Ser His Ser Ser Thr Asn Trp Ser Tyr Thr Asn Thr
325 330 335

Glu Gly Ala Ser Val Glu Ala Gly Ile Gly Pro Lys Gly Ile Ser Phe
340 345 350

Gly Val Ser Val Asn Tyr Gln His Ser Glu Thr Val Ala Gln Glu Trp
355 360 365

Gly Thr Ser Thr Gly Asn Thr Ser Gln Phe Asn Thr Ala Ser Ala Gly
370 375 380

Tyr Leu Asn Ala Asn Val Arg Tyr Asn Asn Val Gly Thr Gly Ala Ile
385 390 395 400

Tyr Asp Val Lys Pro Thr Thr Ser Phe Val Leu Asn Asn Asp Thr Ile
405 410 415

Ala Thr Ile Thr Ala Lys Ser Asn Ser Thr Ala Leu Asn Ile Ser Pro
420 425 430

Gly Glu Ser Tyr Pro Lys Lys Gly Gln Asn Gly Ile Ala Ile Thr Ser
435 440 445

Met Asp Asp Phe Asn Ser His Pro Ile Thr Leu Asn Lys Lys Gln Val
450 455 460

Asp Asn Leu Leu Asn Asn Lys Pro Met Met Leu Glu Thr Asn Gln Thr
465 470 475 480

Asp Gly Val Tyr Lys Ile Lys Asp Thr His Gly Asn Ile Val Thr Gly
485 490 495

Gly Glu Trp Asn Gly Val Ile Gln Gln Ile Lys Ala Lys Thr Ala Ser
500 505 510

Ile Ile Val Asp Asp Gly Glu Arg Val Ala Glu Lys Arg Val Ala Ala
515 520 525

Lys Asp Tyr Glu Asn Pro Glu Asp Lys Thr Pro Ser Leu Thr Leu Lys
530 535 540

Asp Ala Leu Lys Leu Ser Tyr Pro Asp Glu Ile Lys Glu Ile Glu Gly
545 550 555 560

Leu Leu Tyr Tyr Lys Asn Lys Pro Ile Tyr Glu Ser Ser Val Met Thr
565 570 575

Tyr Leu Asp Glu Asn Thr Ala Lys Glu Val Thr Lys Gln Leu Asn Asp
580 585 590

Thr Thr Gly Lys Phe Lys Asp Val Ser His Leu Tyr Asp Val Lys Leu
595 600 605

Thr Pro Lys Met Asn Val Thr Ile Lys Leu Ser Ile Leu Tyr Asp Asn
610 615 620

Ala Glu Ser Asn Asp Asn Ser Ile Gly Lys Trp Thr Asn Thr Asn Ile
625 630 635 640

Val Ser Gly Gly Asn Asn Gly Lys Lys Gln Tyr Ser Ser Asn Asn Pro
645 650 655

Asp Ala Asn Leu Thr Leu Asn Thr Asp Ala Gln Glu Lys Leu Asn Lys
660 665 670

Asn Arg Asp Tyr Tyr Ile Ser Leu Tyr Met Lys Ser Glu Lys Asn Thr
675 680 685

Gln Cys Glu Ile Thr Ile Asp Gly Glu Ile Tyr Pro Ile Thr Thr Lys
690 695 700

Thr Val Asn Val Asn Lys Asp Asn Tyr Lys Arg Leu Asp Ile Ile Ala
705 710 715 720

His Asn Ile Lys Ser Asn Pro Ile Ser Ser Ile His Ile Lys Thr Asn
725 730 735

Asp Glu Ile Thr Leu Phe Trp Asp Asp Ile Ser Ile Thr Asp Val Ala
740 745 750

Ser Ile Lys Pro Glu Asn Leu Thr Asp Ser Glu Ile Lys Gln Ile Tyr
755 760 765

Ser Arg Tyr Gly Ile Lys Leu Glu Asp Gly Ile Leu Ile Asp Lys Lys
770 775 780

Gly Gly Ile His Tyr Gly Glu Phe Ile Asn Glu Ala Ser Phe Asn Ile
785 790 795 800

Glu Pro Leu Gln Asn Tyr Val Thr Lys Tyr Lys Val Thr Tyr Ser Ser
805 810 815

Glu Leu Gly Gln Asn Val Ser Asp Thr Leu Glu Ser Asp Lys Ile Tyr
820 825 830

Lys Asp Gly Thr Ile Lys Phe Asp Phe Thr Lys Tyr Ser Xaa Asn Glu
835 840 845

Gln Gly Leu Phe Tyr Asp Ser Gly Leu Asn Trp Asp Phe Lys Ile Asn
850 855 860

Ala Ile Thr Tyr Asp Gly Lys Glu Met Asn Val Phe His Arg Tyr Asn
865 870 875 880

Lys 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1022 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 177I8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 

TGGATTAATT GGGTATTATT TCAAAGGAAA AGATTTTAAT AATCTTACTA TGTTTGCACC 60 

GACACGTGAT AATACCCTTA TGTATGACCA ACAAACAGCG AATGCATTAT TAGATAAAAA 120 

ACAACAAGAA TATCAGTCCA TTCGTTGGAT TGGTTTGATT CAGAGTAAAG AAACGGGCGA 180 

TTTCACATTT AACTTATCAA AGGATGAACA GGCAATTATA GAAATCGATG GGAAAATCAT 240 

TTCTAATAAA GGGAAAGAAA AGCAAGTTGT CCATTTAGAA AAAGAAAAAT TAGTTCCAAT 300

CAAAATAGAG TATCAATCAG ATACGAAATT TAATATTGAT AGTAAAACAT TTAAAGAACT 360

TAAATTATTT AAAATAGATA GTCAAAACCA ATCTCAACAA GTTCAACTGA GAAACCCTGA 420

ATTTAACAAA AAAGAATCAC AGGAATTTTT AGCAAAAGCA TCAAAAACAA ACCTTTTTAA 480

GCAAAAAATG AAAAGAGATA TTGATGAAGA TACGGATACA GATGGAGACT CCATTCCTGA 540 

TCTTTGGGAA GAAAATGGGT ACACGATTCA AAATAAAGTT GCTGTCAAAT GGGATGATTC 600 

GCTAGCAAGT AAGGGATATA CAAAATTTGT TTCGAATCCA TTAGACAGCC ACACAGTTGG 660 

CGATCCCTAT ACTGATTATG AAAAGGCCGC AAGGGATTTA GATTTATCAA ATGCAAAGGA 720 

AACGTTCAAC CCATTGGTAG CTGCTTTYCC AAGTGTGAAT GTTAGTATGG AAAAGGTGAT 780 

ATTATCACCA AATGAAAATT TATCCAATAG TGTAGAGTCT CATTCATCCA CGAATTGGTC 840

TTATACGAAT ACAGAAGGAG CTTCCATTGA AGCTGGTGGC GGTCCATTAG GCCTTTCTTT 900

TGGAGTGAGT GTTAATTATC AACACTCTGA AACAGTTGCA CAAGAATGGG GAACATCTAC 960 

AGGAAATACT TCACAATTCA ATACGGCTTC AGCGGGATAT TTAAATGCCA ATATACGATA 1020 

TA 1022 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 340 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : peptide 



(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 177I8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Gly Leu Ile Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr
1 5 10 15

Met Phe Ala Pro Thr Arg Asp Asn Thr Leu Met Tyr Asp Gln Gln Thr
20 25 30

Ala Asn Ala Leu Leu Asp Lys Lys Gln Gln Glu Tyr Gln Ser Ile Arg
35 40 45

Trp Ile Gly Leu Ile Gln Ser Lys Glu Thr Gly Asp Phe Thr Phe Asn
50 55 60

Leu Ser Lys Asp Glu Gln Ala Ile Ile Glu Ile Asp Gly Lys Ile Ile
65 70 75 80

Ser Asn Lys Gly Lys Glu Lys Gln Val Val His Leu Glu Lys Glu Lys
85 90 95

Leu Val Pro Ile Lys Ile Glu Tyr Gln Ser Asp Thr Lys Phe Asn Ile
100 105 110

Asp Ser Lys Thr Phe Lys Glu Leu Lys Leu Phe Lys Ile Asp Ser Gln
115 120 125

Asn Gln Ser Gln Gln Val Gln Leu Arg Asn Pro Glu Phe Asn Lys Lys
130 135 140

Glu Ser Gln Glu Phe Leu Ala Lys Ala Ser Lys Thr Asn Leu Phe Lys
145 150 155 160

Gln Lys Met Lys Arg Asp Ile Asp Glu Asp Thr Asp Thr Asp Gly Asp
165 170 175

Ser Ile Pro Asp Leu Trp Glu Glu Asn Gly Tyr Thr Ile Gln Asn Lys
180 185 190

Val Ala Val Lys Trp Asp Asp Ser Leu Ala Ser Lys Gly Tyr Thr Lys
195 200 205

Phe Val Ser Asn Pro Leu Asp Ser His Thr Val Gly Asp Pro Tyr Thr
210 215 220

Asp Tyr Glu Lys Ala Ala Arg Asp Leu Asp Leu Ser Asn Ala Lys Glu
225 230 235 240

Thr Phe Asn Pro Leu Val Ala Ala Xaa Pro Ser Val Asn Val Ser Met
245 250 255

Glu Lys Val Ile Leu Ser Pro Asn Glu Asn Leu Ser Asn Ser Val Glu
260 265 270

Ser His Ser Ser Thr Asn Trp Ser Tyr Thr Asn Thr Glu Gly Ala Ser
275 280 285

Ile Glu Ala Gly Gly Gly Pro Leu Gly Leu Ser Phe Gly Val Ser Val
290 295 300

Asn Tyr Gln His Ser Glu Thr Val Ala Gln Glu Trp Gly Thr Ser Thr
305 310 315 320

Gly Asn Thr Ser Gln Phe Asn Thr Ala Ser Ala Gly Tyr Leu Asn Ala
325 330 335

Asn Ile Arg Tyr
340



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 185AA2

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

TGGATTAATT GGGTATTATT TCCAGGAGCA AAACTTTGAG AAACCCGCTT TGATAGCAAA 60 

TAGACAAGCT TCTGATTTGG AAATACCGAA AGATGACGTG AAAGAGTTAC TATCCAAAGA 120 

ACAGCAACAC ATTCAATCTG TTAGATGGCT TGGCTATATT CAGCCACCTC AAACAGGAGA 180

CTATGTATTG TCAACCTCAT CCGACCAACA GGTCGTGATT GAACTCGATG GAAAAACCAT 240

TGTCAATCAA ACTTCTATGA CAGAACCGAT TCAACTAGAA AAAGATAAAC GCTATAAAAT 300

TAGAATTGAA TATGTCCCAG GAGATACACA AGGACAAGAG AACCTTCTGG ACTTTCAACT 360

GAAGTGGTCA ATTTCAGGAG CCGAGATAGA ACCAATTCCG GATCATGCTT TCCATTTACC 420

AGATTTTTCT CATAAACAAG ATCAAGAGAA AATCATCCCT GAAACCAATT TATTTCAGAA 480

ACAAGGAGAT GAGAAAAAAG TATCACGCAG TAAGAGATCT TCAGATAAAG ATCCTGACCG 540 

TGATACAGAT GATGATAGTA TTTCTGATGA ATGGGAAACG AGTGGATATA CCATTCAAAG 600 

ACAGGTGGCA GTGAAATGGG ACGATTCTAT GAAGGAGCTA GGTTATACCA AGTATGTGTC 660 

TAACCCTTAT AAGTCTCGTA CAGTAGGAGA TCCATACACA GATTGGGAAA AAGCGGCTGG 720 

CAGTATCGAT AATGCTGTCA AAGCAGAAGC CAGAAATCCT TTAGTCGCGG CCTATCCAAC 780 

TGTTGGTGTA CATATGGAAA GATTAATTGT CTCCGAACAA CAAAATATAT CAACAGGGCT 840

TGGAAAAACC GTATCTGCGT CTACGTCCGC AAGCAATACC GCAGCGATTA CGGCAGGTAT 900

TGATGCAACA GCTGGTGCCT CTTTACTTGG GCCATCTGGA AGTGTCACGG CTCATTTTTC 960

TTACACGGGA TCTAGTACAG CCACCATTGA AGATAGCTCC AGCCGTAATT GGAGTCGAGA 1020

CCTTGGGATT GATACGGGAC AAGCTGCATA TTTAAATGCC AATATACGAT ATA 1073

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 185AA2

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Gly Leu Ile Gly Tyr Tyr Phe Gln Glu Gln Asn Phe Glu Lys Pro Ala
1 5 10 15

Leu Ile Ala Asn Arg Gln Ala Ser Asp Leu Glu Ile Pro Lys Asp Asp
20 25 30

Val Lys Glu Leu Leu Ser Lys Glu Gln Gln His Ile Gln Ser Val Arg
35 40 45

Trp Leu Gly Tyr Ile Gln Pro Pro Gln Thr Gly Asp Tyr Val Leu Ser
50 55 60

Thr Ser Ser Asp Gln Gln Val Val Ile Glu Leu Asp Gly Lys Thr Ile
65 70 75 80

Val Asn Gln Thr Ser Met Thr Glu Pro Ile Gln Leu Glu Lys Asp Lys
85 90 95

Arg Tyr Lys Ile Arg Ile Glu Tyr Val Pro Gly Asp Thr Gln Gly Gln
100 105 110

Glu Asn Leu Leu Asp Phe Gln Leu Lys Trp Ser Ile Ser Gly Ala Glu
115 120 125

Ile Glu Pro Ile Pro Asp His Ala Phe His Leu Pro Asp Phe Ser His
130 135 140

Lys Gln Asp Gln Glu Lys Ile Ile Pro Glu Thr Asn Leu Phe Gln Lys
145 150 155 160

Gln Gly Asp Glu Lys Lys Val Ser Arg Ser Lys Arg Ser Ser Asp Lys
165 170 175

Asp Pro Asp Arg Asp Thr Asp Asp Asp Ser Ile Ser Asp Glu Trp Glu
180 185 190

Thr Ser Gly Tyr Thr Ile Gln Arg Gln Val Ala Val Lys Trp Asp Asp
195 200 205

Ser Met Lys Glu Leu Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Lys
210 215 220

Ser Arg Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gly
225 230 235 240

Ser Ile Asp Asn Ala Val Lys Ala Glu Ala Arg Asn Pro Leu Val Ala
245 250 255

Ala Tyr Pro Thr Val Gly Val His Met Glu Arg Leu Ile Val Ser Glu
260 265 270

Gln Gln Asn Ile Ser Thr Gly Leu Gly Lys Thr Val Ser Ala Ser Thr
275 280 285

Ser Ala Ser Asn Thr Ala Ala Ile Thr Ala Gly Ile Asp Ala Thr Ala
290 295 300

Gly Ala Ser Leu Leu Gly Pro Ser Gly Ser Val Thr Ala His Phe Ser
305 310 315 320

Tyr Thr Gly Ser Ser Thr Ala Thr Ile Glu Asp Ser Ser Ser Arg Asn
325 330 335

Trp Ser Arg Asp Leu Gly Ile Asp Thr Gly Gln Ala Ala Tyr Leu Asn
340 345 350

Ala Asn Ile Arg Tyr
355



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 196F3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

TGGGTTACNT GGGTATTAYT TTCAGGATAC TAAATTTCAA CAACTTGCTT TAATGGCACA 60
TAGACAAGCC TCAGATTTAG AAATAAACAA AAATGAMGTC AAGGATTTAC TATCAAAGGA 120
TCAACAACAC ATTCAAGCAG TGAGATGGAT GGGCTATATT CAGCCACCTC AAACAGGAGA 180
TTATGTATTG TCAACTTCAT CCGACCAACA GGTCTTCACC GAACTCNATG GAAAAATAAT 240
TCTCAATCAA TCTTCTATGA CCGAACCCAT TCGATTAGAA AAAGATAAAC AATATAMAAT 300
TAGAATTGAA TATGTATCAK AAAGTAAAAC AGAAAAAGAG ACGCTCCTAG ACTTTCAACT 360
CAACTGGTCG ATTTCAGGTG CTACGGTAGA ACCAATTCCA GATAATGCTT TTCAGTTACC 420
AGATCTTTCT CGGGAACAAG NTAAAGATAA AATCATCCCT GAAACAAGTT TATTGCAGGA 480
TCAAGGAGAA GGGAAACAAG TATCTCGAAG TAAAAGATCT CTAGCTGTGA ATCCTCTACA 540
CGATACAGAT GATGATGGGA TTTACGATGA ATGGGAAACA AGCGGCTATA CGATTCAAAG 600
ACAATTGGCA GTAAGATGGA ACGATTCTAT GAAGGATCAA GGCTATACCA AATATGTGTC 660
TAATCCTTAT AAGTCTCATA CTGTAGGAGA TCCATACACA GACTGGGAAA AAGCAGCTGG 720
ACGTATCGAC CAAGCTGTGA AAATAGAAGC CAGAAACCCA TTAGTTGCAG CATATCCAAC 780
AGTTGGCGTA CATATGGAAA GACTGATTGT CTCTGAAAAA CAAAATATAG CAACAGGACT 840
GGGAAAAACA GTATCTGCGT CTACATCTGC AAGTAATACA GCGGGGATTA CAGCGGGAAT 900
TGATGCAACG GTTGGTGCCT CTTTACTTGG ACCTTCGGGA AGTGTCACCG CCCATTTTTC 960
TTATACGGGT TCGAGTACAT CCACTGTTGA AAATAGCTCG AGTAATAATT GGAGTCAAGA 1020
TCTTGGTATT GATACCAGCC AATCTGCGTA CTTAAATGCC AATGTAAGAT ATA 1073

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 196F3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Gly Leu Xaa Gly Tyr Xaa Phe Gln Asp Thr Lys Phe Gln Gln Leu Ala
1 5 10 15

Leu Met Ala His Arg Gln Ala Ser Asp Leu Glu Ile Asn Lys Asn Xaa
20 25 30

Val Lys Asp Leu Leu Ser Lys Asp Gln Gln His Ile Gln Ala Val Arg
35 40 45

Trp Met Gly Tyr Ile Gln Pro Pro Gln Thr Gly Asp Tyr Val Leu Ser
50 55 60

Thr Ser Ser Asp Gln Gln Val Phe Thr Glu Leu Xaa Gly Lys Ile Ile
65 70 75 80

Leu Asn Gln Ser Ser Met Thr Glu Pro Ile Arg Leu Glu Lys Asp Lys
85 90 95

Gln Tyr Xaa Ile Arg Ile Glu Tyr Val Ser Xaa Ser Lys Thr Glu Lys
100 105 110

Glu Thr Leu Leu Asp Phe Gln Leu Asn Trp Ser Ile Ser Gly Ala Thr
115 120 125

Val Glu Pro Ile Pro Asp Asn Ala Phe Gln Leu Pro Asp Leu Ser Arg
130 135 140

Glu Gln Xaa Lys Asp Lys Ile Ile Pro Glu Thr Ser Leu Leu Gln Asp
145 150 155 160

Gln Gly Glu Gly Lys Gln Val Ser Arg Ser Lys Arg Ser Leu Ala Val
165 170 175

Asn Pro Leu His Asp Thr Asp Asp Asp Gly Ile Tyr Asp Glu Trp Glu
180 185 190

Thr Ser Gly Tyr Thr Ile Gln Arg Gln Leu Ala Val Arg Trp Asn Asp
195 200 205

Ser Met Lys Asp Gln Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Lys
210 215 220

Ser His Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gly
225 230 235 240

Arg Ile Asp Gln Ala Val Lys Ile Glu Ala Arg Asn Pro Leu Val Ala
245 250 255

Ala Tyr Pro Thr Val Gly Val His Met Glu Arg Leu Ile Val Ser Glu
260 265 270

Lys Gln Asn Ile Ala Thr Gly Leu Gly Lys Thr Val Ser Ala Ser Thr
275 280 285

Ser Ala Ser Asn Thr Ala Gly Ile Thr Ala Gly Ile Asp Ala Thr Val
290 295 300

Gly Ala Ser Leu Leu Gly Pro Ser Gly Ser Val Thr Ala His Phe Ser
305 310 315 320

Tyr Thr Gly Ser Ser Thr Ser Thr Val Glu Asn Ser Ser Ser Asn Asn
325 330 335

Trp Ser Gln Asp Leu Gly Ile Asp Thr Ser Gln Ser Ala Tyr Leu Asn
340 345 350

Ala Asn Val Arg Tyr
355

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE : 

(C) INDIVIDUAL ISOLATE: 196J4

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

TGGGTTAATT GGGTATTATT TCCAGGATCA AAAGTTTCAA CAACTTGCTT TAATGGCACA 60

TAGACAAGCT TCTAATTTAA ACATACCAAA AAATGAAGTG AAACAGTTAT TATCCGAAGA 120

TCAACAACAT ATTCAATCCG TTAGGTGGAT CGGATATATC AAATCACCTC AAACGGGAGA 180

TTATATATTG TCAACTTCAG CCGATCGACA TGTCGTAATT GAACTTGACG GAAAAACCAT 240

TCTTAATCAA TCTTCTATGA CAGCACCCAT TCAATTAGAA AAAGATAAAC TTTATAAAAT 300

TAGAATTGAA TATGTCCCAG AAGATACAAA AGGACAGGAA AACCTCTTTG ACTTTCAACT 360

GAATTGGTCA ATTTCAGGAG ATAAGGTAGA ACCAATTCCG GAGAATGCAT TTCTGTTGCC 420

AGACTTTTCT CATAAACAAG ATCAAGAGAA AATCATCCCT GAAGCAAGTT TATTCCAGGA 480

ACAAGAAGAT GCAAACAAAG TCTCTCGAAA TAAACGATCC ATAGCTACAG GTTCTCTGTA 540

TGATACAGAT GATGATGCTA TTTATGATGA ATGGGAAACA GAAGGATACA CGATACAACG 600

TCAAATAGCG GTGAAATGGG ACGATTCTAT GAAGGAGCGA GGTTATACCA AGTATGTGTC 660

TAACCCCTAT AATTCGCATA CAGTAGGAGA TCCCTACACA GATTGGGAAA AAGCGGCTGG 720

ACGCATTGAT CAGGCAATCA AAGTAGAAGC TAGGAATCCA TTAGTTGCAG CCTATCCAAC 780

AGTTGGTGTA CATATGGAAA AACTGATTGT TTCTGAGAAA CAAAATATAT CAACTGGGGT 840

TGGAAAAACA GTATCTGCGG CTATGTCCAC TGGTAATACC GCAGCGATTA CGGCAGGAAT 900

TGATGCGACC GCCGGGGCAT CTTTACTTGG ACCTTCTGGA AGTGTGACGG CTCATTTTTC 960

TTATACAGGG TCTAGTACAT CTACAATTGA AAATAGTTCA AGCAATAATT GGAGTAAAGA 1020

TCTGGGAATC GATACGGGGC AATCTGCTTA TTTAAATGCC AATGTACGAT ATA 1073

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 196J4 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Gly Leu Ile Gly Tyr Tyr Phe Gln Asp Gln Lys Phe Gln Gln Leu Ala
1 5 10 15

Leu Met Ala His Arg Gln Ala Ser Asn Leu Asn Ile Pro Lys Asn Glu
20 25 30

Val Lys Gln Leu Leu Ser Glu Asp Gln Gln His Ile Gln Ser Val Arg
35 40 45

Trp Ile Gly Tyr Ile Lys Ser Pro Gln Thr Gly Asp Tyr Ile Leu Ser
50 55 60

Thr Ser Ala Asp Arg His Val Val Ile Glu Leu Asp Gly Lys Thr Ile
65 70 75 80

Leu Asn Gln Ser Ser Met Thr Ala Pro Ile Gln Leu Glu Lys Asp Lys
85 90 95

Leu Tyr Lys Ile Arg Ile Glu Tyr Val Pro Glu Asp Thr Lys Gly Gln
100 105 110

Glu Asn Leu Phe Asp Phe Gln Leu Asn Trp Ser Ile Ser Gly Asp Lys
115 120 125

Val Glu Pro Ile Pro Glu Asn Ala Phe Leu Leu Pro Asp Phe Ser His
130 135 140

Lys Gln Asp Gln Glu Lys Ile Ile Pro Glu Ala Ser Leu Phe Gln Glu
145 150 155 160

Gln Glu Asp Ala Asn Lys Val Ser Arg Asn Lys Arg Ser Ile Ala Thr
165 170 175

Gly Ser Leu Tyr Asp Thr Asp Asp Asp Ala Ile Tyr Asp Glu Trp Glu
180 185 190

Thr Glu Gly Tyr Thr Ile Gln Arg Gln Ile Ala Val Lys Trp Asp Asp
195 200 205

Ser Met Lys Glu Arg Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Asn
210 215 220

Ser His Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gly
225 230 235 240

Arg Ile Asp Gln Ala Ile Lys Val Glu Ala Arg Asn Pro Leu Val Ala
245 250 255

Ala Tyr Pro Thr Val Gly Val His Met Glu Lys Leu Ile Val Ser Glu
260 265 270

Lys Gln Asn Ile Ser Thr Gly Val Gly Lys Thr Val Ser Ala Ala Met
275 280 285

Ser Thr Gly Asn Thr Ala Ala Ile Thr Ala Gly Ile Asp Ala Thr Ala
290 295 300

Gly Ala Ser Leu Leu Gly Pro Ser Gly Ser Val Thr Ala His Phe Ser
305 310 315 320

Tyr Thr Gly Ser Ser Thr Ser Thr Ile Glu Asn Ser Ser Ser Asn Asn
325 330 335

Trp Ser Lys Asp Leu Gly Ile Asp Thr Gly Gln Ser Ala Tyr Leu Asn
340 345 350

Ala Asn Val Arg Tyr
355



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1046 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(ii) MOLECULE TYPE: DNA (genomic)



(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 197T1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

TGGATTAATT GGGTATTATT TTAAAGGAAA AGATTTTAAT AATCTTACTA TATTTGCTCC
AACACGTGAG AATACTCTTA TTTATGATTT AGAAACAGCG AATTCTTTAT TAGATAAGCA
ACAACAAACC TATCAATCTA TTCGTTGGAT CGGTTTAATA AAAAGCAAAA AAGCTGGAGA
TTTTACCTTT CAATTATCGG ATGATGAGCA TGCTATTATA GAAATCGATG GGAAAGTTAT
TTCGCAAAAA GGCCAAAAGA AACAAGTTGT TCATTTAGAA AAAGATAAAT TAGTTCCCAT
CAAAATTGAA TATCAATCTG ATAAAGCGTT AAACCCAGAC AGTCAAATGT TTAAAGAATT
GAAATTATTT AAAATAAATA GTCAAAAACA ATCTCAGCAA GTGCAACAAG ACGAATTGAG
AAATCCTGAA TTTGGTAAAG AAAAAACTCA AACATATTTA AAGAAAGCAT CGAAAAGCAG
CTTGTTTAGC AATAAAAGTA AACGAGATAT AGA^AAGAT ATAGA^AGG ATACAGATAC
AGATGGAGAT GCCATTCCTG ATGTATGGGA AGAAAATGGG TATACCATCA AAGGAAGAGT
AGCTGTTAAA TGGGACGAAG GATTAGCTGA TAAGGGATAT AAAAAGTTTG TTTCCAATCC
TTTTAGACAG CACACTCCTG GTGACCCCTA TAGTGACTAT GAAAAGGCAT CAAAAGATTT
GGATTTATCT AATGCAAAAG AAACATTTAA TCCATTGGTG GCTGCTTTTC CAAGTGTCAA
TGTTAGCTTG GAAAATGTCA CCATATCAAA AGATGAAAAT AAAACTGCTG AAATTGCGTC
TACTTCATCG AATAATTGGT CCTATACAAA TACAGAGGGG GCATCTATTG AAGCTGGAAT
TGGACCAGAA GGTTTGTTGT CTTTTGGAGT AAGTGCCAAT TATCAACATT CTGAAACAGT
GGCCAAAGAG TGGGGTACAA ^TAAGGGAGA CGCAACACAA TATAATACAG CTTCAGCAGG
ATATCTAAAT GCCAATGTAC GATATA

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 197T1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

Gly Leu Ile Gly Tyr Tyr Phe Lys Gly Lys Asp Phe Asn Asn Leu Thr
1 5 10 15

Ile Phe Ala Pro Thr Arg Glu Asn Thr Leu Ile Tyr Asp Leu Glu Thr
20 25 30

Ala Asn Ser Leu Leu Asp Lys Gln Gln Gln Thr Tyr Gln Ser Ile Arg
35 40 45

Trp Ile Gly Leu Ile Lys Ser Lys Lys Ala Gly Asp Phe Thr Phe Gln
50 55 60

Leu Ser Asp Asp Glu His Ala Ile Ile Glu Ile Asp Gly Lys Val Ile
65 70 75 80

Ser Gln Lys Gly Gln Lys Lys Gln Val Val His Leu Glu Lys Asp Lys
85 90 95

Leu Val Pro Ile Lys Ile Glu Tyr Gln Ser Asp Lys Ala Leu Asn Pro
100 105 110

Asp Ser Gln Met Phe Lys Glu Leu Lys Leu Phe Lys Ile Asn Ser Gln
115 120 125

Lys Gln Ser Gln Gln Val Gln Gln Asp Glu Leu Arg Asn Pro Glu Phe
130 135 140

Gly Lys Glu Lys Thr Gln Thr Tyr Leu Lys Lys Ala Ser Lys Ser Ser
145 150 155 160

Leu Phe Ser Asn Lys Ser Lys Arg Asp Ile Asp Glu Asp Ile Asp Glu
165 170 175

Asp Thr Asp Thr Asp Gly Asp Ala Ile Pro Asp Val Trp Glu Glu Asn
180 185 190

Gly Tyr Thr Ile Lys Gly Arg Val Ala Val Lys Trp Asp Glu Gly Leu
195 200 205

Ala Asp Lys Gly Tyr Lys Lys Phe Val Ser Asn Pro Phe Arg Gln His
210 215 220

Thr Ala Gly Asp Pro Tyr Ser Asp Tyr Glu Lys Ala Ser Lys Asp Leu
225 230 235 240

Asp Leu Ser Asn Ala Lys Glu Thr Phe Asn Pro Leu Val Ala Ala Phe
245 250 255

Pro Ser Val Asn Val Ser Leu Glu Asn Val Thr Ile Ser Lys Asp Glu
260 265 270

Asn Lys Thr Ala Glu Ile Ala Ser Thr Ser Ser Asn Asn Trp Ser Tyr
275 280 285

Thr Asn Thr Glu Gly Ala Ser Ile Glu Ala Gly Ile Gly Pro Glu Gly
290 295 300

Leu Leu Ser Phe Gly Val Ser Ala Asn Tyr Gln His Ser Glu Thr Val
305 310 315 320

Ala Lys Glu Trp Gly Thr Thr Lys Gly Asp Ala Thr Gln Tyr Asn Thr
325 330 335

Ala Ser Ala Gly Tyr Leu Asn Ala Asn Val Arg Tyr 
340 345 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 197U2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

TGGGTTAATT GGGTATTATT TTACGGATGA GCAGCATAAG GAAGTAGCTT TTAYTCAATT 60 

AGGTGAAAAA AMTACATTAG CAGATTCAGC GAAAATGAAG AAAAACGACA AAAAGATTCT 120 

TTCAGCGCAA TGGATTGGWA ATATACAGGT ACCTCAAACA GGGGAATATA CGTTTTCCAC 180 

CTCTTCTGAT AAAGATACTA TTTTAAAACT CAATGGGGAA ACGATTATTC AAAAATCTAA 240 

TATGGAGAAA CCCATATATT TAGAAAAAGA TAAAGTATAC GAAATTCAAA TCGAGCATAA 300

CAACCCGAAT AGTGAGAAAA CTTTACGATT ATCTTGGAAA ATGGGGGGCA CCAATTCAGA 360 

GCTCATCCCA GAAAAATACA TTCTGTCTCC CGATTTTTCT AAAATAGCAG ATCAAGAAAA 420 

TGARAAAAAA GACGCATCGA GACATTTATT ATTTACTAAG GATGAATTGA AAGATTCTGA 480 

TAAGGACCTT ATCCCAGATG AATTTGAAAA AAATGGGTAT ACATTCAATG GGATTCAAAT 540

TGTTCCTTGG GATGAATCTC TTCAAGAACA GGGCTTTAAA AAATATATTT CCAATCCATA 600 

TCAATCGCGT ACAGCGCAGG ATCCATATAC AGATTTTGAA AAAGTAACCG GATATATGCC 660 

TGCCGAAACA CAACTGGAAA CGCGTGACCC TTTAGTTGCG GCTTATCCGG CTGTAGGGGT 720 

TACGATGGAA CAGTTTATTT TCTCTAAAAA TGATAATGTG CAGGAATCTA ATGGTGGAGG 780 

AACTTCAAAA AGTATGACAG AAAGTTCTGA AACGACTTAC TCTGTTGAGA TAGGAGGGAA 840 

ATTTACATTG AATCCATTCG CACTGGCGGA AATTTCTCCT AAATATTCTC ACAGTTGGAA 900 

AAATGGAGCA TCTACAACAG AGGGAGAAAG TACTTCCTGG AGCTCACAAA TTGGTATTAA 960

CACGGCTGAA CGCGCGTTTT TTAAATGCCA ATATTCGATA TA 1002

(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 197U2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

Gly Leu Ile Gly Tyr Tyr Phe Thr Asp Glu Gln His Lys Glu Val Ala
1 5 10 15

Phe Xaa Gln Leu Gly Glu Lys Xaa Thr Leu Ala Asp Ser Ala Lys Met
20 25 30

Lys Lys Asn Asp Lys Lys Ile Leu Ser Ala Gln Trp Ile Xaa Asn Ile
35 40 45

Gln Val Pro Gln Thr Gly Glu Tyr Thr Phe Ser Thr Ser Ser Asp Lys
50 55 60

Asp Thr Ile Leu Lys Leu Asn Gly Glu Thr Ile Ile Gln Lys Ser Asn
65 70 75 80

Met Glu Lys Pro Ile Tyr Leu Glu Lys Asp Lys Val Tyr Glu Ile Gln
85 90 95

Ile Glu His Asn Asn Pro Asn Ser Glu Lys Thr Leu Arg Leu Ser Trp
100 105 110

Lys Met Gly Gly Thr Asn Ser Glu Leu Ile Pro Glu Lys Tyr Ile Leu
115 120 125

Ser Pro Asp Phe Ser Lys Ile Ala Asp Gln Glu Asn Xaa Lys Lys Asp
130 135 140

Ala Ser Arg His Leu Leu Phe Thr Lys Asp Glu Leu Lys Asp Ser Asp
145 150 155 160

Lys Asp Leu Ile Pro Asp Glu Phe Glu Lys Asn Gly Tyr Thr Phe Asn
165 170 175

Gly Ile Gln Ile Val Pro Trp Asp Glu Ser Leu Gln Glu Gln Gly Phe
180 185 190

Lys Lys Tyr Ile Ser Asn Pro Tyr Gln Ser Arg Thr Ala Gln Asp Pro
195 200 205

Tyr Thr Asp Phe Glu Lys Val Thr Gly Tyr Met Pro Ala Glu Thr Gln
210 215 220

Leu Glu Thr Arg Asp Pro Leu Val Ala Ala Tyr Pro Ala Val Gly Val
225 230 235 240

Thr Met Glu Gln Phe Ile Phe Ser Lys Asn Asp Asn Val Gln Glu Ser
245 250 255

Asn Gly Gly Gly Thr Ser Lys Ser Met Thr Glu Ser Ser Glu Thr Thr
260 265 270

Tyr Ser Val Glu Ile Gly Gly Lys Phe Thr Leu Asn Pro Phe Ala Leu
275 280 285

Ala Glu Ile Ser Pro Lys Tyr Ser His Ser Trp Lys Asn Gly Ala Ser
290 295 300

Thr Thr Glu Gly Glu Ser Thr Ser Trp Ser Ser Gln Ile Gly Ile Asn
305 310 315 320

Thr Ala Glu Arg Ala Phe Phe Lys Cys Gln Tyr Ser Ile
325 330



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 202E1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

TGGGTTAATT GGGTACTATT TTCAGGATCA AAAGTTTCAA CAACTCGCTT TGATGGCACA 60 

TAGACAAGCT TCAGATTTAG AAATACCTAA AAATGAAGTG AAGGATATAT TATCTAAAGA 120 

TCAACAACAT ATTCAATCAG TGAGATGGAG GGGGTATATT AAGCCACCTC AAACAGGAGA 180 

CTATATATTG TCAACCTCAT CCGACCAACA GGTCGTGATT GAACTCGATG GAAAAAACAT 240

TGTCAATCAA ACTTCTATGA CAGAACCAAT TCAACTCGAA AAAGATAAAC TCTATAAAAT 300 

TAGAATTGAA TATGTCCCAG GAGATACAAA AGGACAAGAG AGCCTCCTTG ACTTTCAACT 360

TAACTGGTCA ATTTCAGGAG ATACGGTGGA ACCAATTCCG GAGAATGCAT TTCTGTTACC 420

AGACTTTTCT CATCAACAAG ATCAAGAGAA ACTCATCCCT GAAATCAGTC TATTTCAGGA 480

ACAAGGAGAT GAGAAAAAAG TATCTCGTAG TAAGAGGTCT TTAGCTACAA ACCCTCTCCT 540

TGATACAGAT GATGATGGTA TTTATGATGA ATGGGAAACG GAAGGATACA CAATACAGGG 600 

ACAACTAGCG GTGAAATGGG ACGATTCTAT GAAGGAGCGA GGTTATACTA AGTATGTGTC 660 

TAACCCTTAC AAGGCTCATA CAGTAGGAGA TCCCTACACA GATTGGGAAA AAGCGGCTGG 720

CCGTATCGAT AACGCTGTCA AAGCAGAAGC TAGGAATCCT TTAGTCGCGG CCTATCCAAC 780 

TGTTGGTGTA CATATGGAAA GACTAATTGT CTCCGAAAAA CAAAATATAT CAACAGGACT 840

TGGAAAAACC GTATCTGTGT CTATGTCCGC AAGCAATACC GCAGCGATTA CGGCAGGAAT 900 

TAATGCAACA GCCGGTGCCT CTTTACTTGG GCCATCTGGA AACGTCACGG CTCATTTTTC 960 

TTATACAGGA TCTAGTACAT CCACTGTTGA AAATAGCTCA AGTAATAATT GGAGTCAAGA 1020

TCTTGGAATC GATACGGGAC AATCTGCGTA TTTAAATGCC AATGTAAGAT ATA 1073

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 202E1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Gly Leu Ile Gly Tyr Tyr Phe Gln Asp Gln Lys Phe Gln Gln Leu Ala
1 5 10 15

Leu Met Ala His Arg Gln Ala Ser Asp Leu Glu Ile Pro Lys Asn Glu
20 25 30

Val Lys Asp Ile Leu Ser Lys Asp Gln Gln His Ile Gln Ser Val Arg
35 40 45

Trp Arg Gly Tyr Ile Lys Pro Pro Gln Thr Gly Asp Tyr Ile Leu Ser
50 55 60

Thr Ser Ser Asp Gln Gln Val Val Ile Glu Leu Asp Gly Lys Asn Ile
65 70 75 80

Val Asn Gln Thr Ser Met Thr Glu Pro Ile Gln Leu Glu Lys Asp Lys
85 90 95

Leu Tyr Lys Ile Arg Ile Glu Tyr Val Pro Gly Asp Thr Lys Gly Gln
100 105 110

Glu Ser Leu Leu Asp Phe Gln Leu Asn Trp Ser Ile Ser Gly Asp Thr
115 120 125

Val Glu Pro Ile Pro Glu Asn Ala Phe Leu Leu Pro Asp Phe Ser His
130 135 140

Gln Gln Asp Gln Glu Lys Leu Ile Pro Glu Ile Ser Leu Phe Gln Glu
145 150 155 160

Gln Gly Asp Glu Lys Lys Val Ser Arg Ser Lys Arg Ser Leu Ala Thr
165 170 175

Asn Pro Leu Leu Asp Thr Asp Asp Asp Gly Ile Tyr Asp Glu Trp Glu
180 185 190

Thr Glu Gly Tyr Thr Ile Gln Gly Gln Leu Ala Val Lys Trp Asp Asp
195 200 205

Ser Met Lys Glu Arg Gly Tyr Thr Lys Tyr Val Ser Asn Pro Tyr Lys
210 215 220

Ala His Thr Val Gly Asp Pro Tyr Thr Asp Trp Glu Lys Ala Ala Gly
225 230 235 240

Arg Ile Asp Asn Ala Val Lys Ala Glu Ala Arg Asn Pro Leu Val Ala
245 250 255

Ala Tyr Pro Thr Val Gly Val His Met Glu Arg Leu Ile Val Ser Glu
260 265 270

Lys Gln Asn Ile Ser Thr Gly Leu Gly Lys Thr Val Ser Val Ser Met
275 280 285

Ser Ala Ser Asn Thr Ala Ala Ile Thr Ala Gly Ile Asn Ala Thr Ala
290 295 300

Gly Ala Ser Leu Leu Gly Pro Ser Gly Asn Val Thr Ala His Phe Ser
305 310 315 320

Tyr Thr Gly Ser Ser Thr Ser Thr Val Glu Asn Ser Ser Ser Asn Asn
325 330 335

Trp Ser Gln Asp Leu Gly Ile Asp Thr Gly Gln Ser Ala Tyr Leu Asn
340 345 350

Ala Asn Val Arg Tyr
355

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 967 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: KB33 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

TGGATTACTT GGGTACTATT TTGAAGAACC AAACTTTAAT GACCTTCTAT TAATCACACA 60 

AAAAAACAAC AGTAATTTAT CTCTAGAAAA AGAACATATT TCATCGTTAT CTAGTATTAG 120

AAATAAAGGC ATTCAATCTG CTAGATGGTT AGGTTTTTTA AAACCAAAGC AAACGGATGA 180 

ATATGTTTTT TTTAGTCCTT CCAACCATGA AATCATGATT CAAATCGATA ACAAAATTAT 240

TGTAATGGGT AGAAAAATTA TGTTAGAAGA AGGAAAGGTA TATCCAATTC GAATTGAATG 300

CCGCTTTGAA AAAACAAATA ATCTAGATAT AAACTGCGAA CTACTTTGGA CGCATTCTGA 360

TACAAAAGAA ATCATTTCTC AAAACTGTTT GCTGGCACCT GATTATCATA ATACAGAATT 420 

TTACCCAAAA ACAAATTTAT TTGGGGATGT ATCTACTACG ACTAGTGATA CTGATAATGA 480

TGGAATACCA GATGACTGGG AAATTAATGG TTATACGTTT GATGGTACAA ATATAATTCA 540

ATGGAATCCT GCTTATGAAG GGTTATATAC TAAATATATT TCTAACCCTA AACAAGCAAG 600 

TACAGTAGGT GATCCATATA CAGATTTAGA GAACGTMCAA AGCTAAAKGG ATCAAAGAAS 660

CARGAAAYCC TTKTAGCAGA AGCTWATCCG AAAAATTGGA BTTAGCATGG AAGAATTACT 720

CRTCTCTKTA WAARTGKTGA TKTWTTCAAA TGCTCAAGAA AATKACTACT TACTTCTAGT 780 

AGRACAGAAG GCACTTCASG TAGYGCAGGC ATTGAGGGAG GAGCAGAAGG AAAAAAACCT 840

ACAGGATTGG TTTCAGCCTC CTTTTCGCAT TCATCTTCAA CAACAAACAC AACGGAACAA 900 

ATGAATGGAA CAATGATTCA TCTTGATACA GGAGAATCAG CGTATTTAAA TGCCAATGTA 960 

AGATATA 967

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 972 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: KB38 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

TGGATTACTT GGGTATTATT TTGAAGAACC AAACTTTAAT AACCTTCTAT TAATCACACA 60 

AAAAAACAAC AGTAATTTAT CTCTAGAAAA AGAACATATT TCATCGTTAT CTAGTATTAG 120

AAATAAAGGC ATTCAATCTG CTAGATGGTT AGGTTTTTTA AAACCAGAGC AAACGGATGA 180

ATATGTTTTT TTTAGTCCTT CCAACCATGA AATTATGATT CAAATCGATA ACAAAATTAT 240

TGTAATGGGT AGAAAAATTA TGTTAGAAAA AGGAAAGGTA TATCCAATTC GAATTGAATG 300

CCGCTTTGAA AAAACAAATA ATATAGATAT AAACTGCGAA CTACTTTGGA CGCACTCTGA 360 

TACAAAAGAA ATCATTTCTC AAAACTTTTT GCTGGCACCT GATTATAACA ATACAGAATT 420 

TTATCCAAAA ACAAATTTAT TTGGAGATGT ATCTACTACG ACTWAGTGAT ACTGATAATG 480

ATGGAATACC AGATGACTGG GAAATTAATG GTTATACCTT TGATGGTACA AATATAATTC 540

AGTGGAATTC TGCTTATGAA GGGTTATATA CTAAATATGT TTCTAATCCT AAACAAGCAA 600 

GTACAGTAGG TGATCCATAT ACAGATTTAG AGAAAGTAAC AGCTCAAATG GATCGAGCAA 660 

CCTCTCTAGA AGCAAGGAAT CCTTTAGTAG CAGCTTATCC AAAAATTGGA GTTAGCATGG 720 

AAGAATTACT CATCTCTTTA AATGTTGATT TTTCAAATGC TCAAGAAAAT ACTACTTCTT 780 

CTAGTAGAAC AGAAGGCACT TCACGTAGCG CAGGCATTGA GGGAGGAGCA GAAGGAAAAA 840

AACCTACAGG ATTGGTTTCA GCCTCCTTTT CGCATTCATC TTCAACAACA AACACAACGG 900 

AACAAATGAA TGGAACAATG ATTCATCTTG ATACAGGAGA ATCAGCGTAT TTAAATGCCA 960 

ATGTAAGATA TA 972 

(2) INFORMATION FOR SEQ ID NO:49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
CTTGAYTTTA AARATGATRT A 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
AATRGCSWAT AAATAMGCAC C 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1341 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 177C8 - vip2 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 

ATGTTTATGG TTTCTAAAAA ATTACAAGTA GTTACTAAAA CTGTATTGCT TAGTACAGTT 60

TTCTCTATAT CTTTATTAAA TAATGAAGTG ATAAAAGCTG AACAATTAAA TATAAATTCT 120 

CAAAGTAAAT ATACTAACTT GCAAAATCTA AAAATCACTG ACAAGGTAGA GGATTTTAAA 180 

GAAGATAAGG AAAAAGCGAA AGAATGGGGG AAAGAAAAAG AAAAAGAGTG GAAACTAACT 240

GCTACTGAAA AAGGAAAAAT GAATAATTTT TTAGATAATA AAAATGATAT AAAGACAAAT 300 

TATAAAGAAA TTACTTTTTC TATGGCAGGC TCATTTGAAG ATGAAATAAA AGATTTAAAA 360 

GAAATTGATA AGATGTTTGA TAAAACCAAT CTATCAAATT CTATTATCAC CTATAAAAAT 420

GTGGAACCGA CAACAATTGG ATTTAATAAA TCTTTAACAG AAGGTAATAC GATTAATTCT 480

GATGCAATGG CACAGTTTAA AGAACAATTT TTAGATAGGG ATATTAAGTT TGATAGTTAT 540 

CTAGATACGC ATTTAACTGC TCAACAAGTT TCCAGTAAAG AAAGAGTTAT TTTGAAGGTT 600

ACGGTTCCGA GTGGGAAAGG TTCTACTACT CCAACAAAAG CAGGTGTCAT TTTAAATAAT 660

AGTGAATACA AAATGCTCAT TGATAATGGG TATATGGTCC ATGTAGATAA GGTATCAAAA 720

GTGGTGAAAA AAGGGGTGGA GTGCTTACAA ATTGAAGGGA CTTTAAAAAA GAGTCTTGAC 780

TTTAAAAATG ATATAAATGC TGAAGCGCAT AGCTGGGGTA TGAAGAATTA TGAAGAGTGG 840

GCTAAAGATT TAACCGATTC GCAAAGGGAA GCTTTAGATG GGTATGCTAG GCAAGATTAT 900

AAAGAAATCA ATAATTATTT AAGAAATCAA GGCGGAAGTG GAAATGAAAA ACTAGATGCT 960

CAAATAAAAA ATATTTCTGA TGCTTTAGGG AAGAAACCAA TACCGGAAAA TATTACTGTG 1020

TATAGATGGT GTGGCATGCC GGAATTTGGT TATCAAATTA GTGATCCGTT ACCTTCTTTA 1080

AAAGATTTTG AAGAACAATT TTTAAATACA ATCAAAGAAG ACAAAGGATA TATGAGTACA 1140

AGCTTATCGA GTGAACGTCT TGCAGCTTTT GGATCTAGAA AAATTATATT ACGATTACAA 1200

GTTCCGAAAG GAAGTACGGG TGCGTATTTA AGTGCCATTG GTGGATTTGC AAGTGAAAAA 1260

GAGATCCTAC TTGATAAAGA TAGTAAATAT CATATTGATA AAGTAACAGA GGTAATTATT 1320

AAGGTGTTAA GCGATATGTA G 1341

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 446 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: 177C8 - vip2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Met Phe Met Val Ser Lys Lys Leu Gln Val Val Thr Lys Thr Val Leu
1 5 10 15

Leu Ser Thr Val Phe Ser Ile Ser Leu Leu Asn Asn Glu Val Ile Lys
20 25 30

Ala Glu Gln Leu Asn Ile Asn Ser Gln Ser Lys Tyr Thr Asn Leu Gln
35 40 45

Asn Leu Lys Ile Thr Asp Lys Val Glu Asp Phe Lys Glu Asp Lys Glu
50 55 60

Lys Ala Lys Glu Trp Gly Lys Glu Lys Glu Lys Glu Trp Lys Leu Thr
65 70 75 80

Ala Thr Glu Lys Gly Lys Met Asn Asn Phe Leu Asp Asn Lys Asn Asp
85 90 95

Ile Lys Thr Asn Tyr Lys Glu Ile Thr Phe Ser Met Ala Gly Ser Phe
100 105 110

Glu Asp Glu Ile Lys Asp Leu Lys Glu Ile Asp Lys Met Phe Asp Lys
115 120 125

Thr Asn Leu Ser Asn Ser Ile Ile Thr Tyr Lys Asn Val Glu Pro Thr
130 135 140

Thr Ile Gly Phe Asn Lys Ser Leu Thr Glu Gly Asn Thr Ile Asn Ser
145 150 155 160

Asp Ala Met Ala Gln Phe Lys Glu Gln Phe Leu Asp Arg Asp Ile Lys
165 170 175

Phe Asp Ser Tyr Leu Asp Thr His Leu Thr Ala Gln Gln Val Ser Ser
180 185 190

Lys Glu Arg Val Ile Leu Lys Val Thr Val Pro Ser Gly Lys Gly Ser
195 200 205

Thr Thr Pro Thr Lys Ala Gly Val Ile Leu Asn Asn Ser Glu Tyr Lys
210 215 220

Met Leu Ile Asp Asn Gly Tyr Met Val His Val Asp Lys Val Ser Lys
225 230 235 240

Val Val Lys Lys Gly Val Glu Cys Leu Gln Ile Glu Gly Thr Leu Lys
245 250 255

Lys Ser Leu Asp Phe Lys Asn Asp Ile Asn Ala Glu Ala His Ser Trp
260 265 270

Gly Met Lys Asn Tyr Glu Glu Trp Ala Lys Asp Leu Thr Asp Ser Gln
275 280 285

Arg Glu Ala Leu Asp Gly Tyr Ala Arg Gln Asp Tyr Lys Glu Ile Asn
290 295 300

Asn Tyr Leu Arg Asn Gln Gly Gly Ser Gly Asn Glu Lys Leu Asp Ala
305 310 315 320

Gln Ile Lys Asn Ile Ser Asp Ala Leu Gly Lys Lys Pro Ile Pro Glu
325 330 335

Asn Ile Thr Val Tyr Arg Trp Cys Gly Met Pro Glu Phe Gly Tyr Gln
340 345 350

Ile Ser Asp Pro Leu Pro Ser Leu Lys Asp Phe Glu Glu Gln Phe Leu
355 360 365

Asn Thr Ile Lys Glu Asp Lys Gly Tyr Met Ser Thr Ser Leu Ser Ser
370 375 380

Glu Arg Leu Ala Ala Phe Gly Ser Arg Lys Ile Ile Leu Arg Leu Gln
385 390 395 400

Val Pro Lys Gly Ser Thr Gly Ala Tyr Leu Ser Ala Ile Gly Gly Phe
405 410 415

Ala Ser Glu Lys Glu Ile Leu Leu Asp Lys Asp Ser Lys Tyr His Ile
420 425 430

Asp Lys Val Thr Glu Val Ile Ile Lys Val Leu Ser Asp Met
435 440 445



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
GGATTCGTTA TCAGAAA 17



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
CTGTYGCTAA CAATGTC 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Ala Asp Glu Pro Phe Asn Ala Asp 
1 5 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
GCTGATGAAC CATTTAATGC C 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Leu Phe Lys Val Asp Thr Lys Gin 
1 5 



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

CTCTTTAAAG TAGATACTAA GC 22

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

Pro Asp Glu Asn Leu Ser Asn Ile Glu
1 5 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
GATGAGAACT TATCAAATAG TATC 



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Ala Asn Ser Leu Leu Asp Lys Gln Gln Gln Thr Tyr
1 5 10 



(2) INFORMATION FOR SEQ ID NO:62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
CGAATTCTTT ATTAGATAAG CAACAACAAA CCT 33 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63: 

Val Ile Ser Gln Lys Gly Gln Lys
1 5 



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
GTTATTTCGC AAAAAGGCCA AAAG 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
Glu Tyr Gln Ser Asp Lys Ala Leu Asn Pro



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
GAATATCAAT CTGATAAAGC GTTAAACCCA G 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Ser Ser Leu Phe Ser Asn Lys Ser Lys
1 5



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
GCAGCYTGTT TAGCAATAAA AGT 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Ile Lys Gly Arg Val Ala Val Lys
1 5



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

CAAAGGAAGA GTAGCTGTTA 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Val Asn Val Ser Leu Glu Asn Val Thr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72: 
CAATGTTAGC TTGGAAAATG TCACC 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Thr Ala Phe Ile Gln Val Gly Glu
1 5 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:74: 
AGCATTTATT CAAGTAGGAG 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Tyr Leu Leu Ser Thr Ser Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

TCTACTTTCC ACGTCCTCT

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Gln Ile Gln Pro Gln Gln Arg
1 5 



(2) INFORMATION FOR SEQ ID NO:78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
CAGATACAAC CGCAACAGC 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Pro Gln Gln Arg Ser Thr Gln Ser
1 5

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

CCGCAACAGC GTTCAACTCA ATC 23

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Asp Gly Ala Ile Val Ala Trp
1 5 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: 

GACGGTGCGA TTGTTGCCTG G 21

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: 

Glu Gly Asp Ser Gly Thr Val
1 5

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
GAAGGAGACT CAGGTACTG 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Thr Val Thr Asn Thr Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
CCGTAACCAA TACAAGCAC 

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Ser Ser Gln Leu Ala Tyr Asn Pro Ser
1 5 

(2) INFORMATION FOR SEQ ID NO:88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
CTTCACAATT AGCGTATAAT CCTTC 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Glu Gln His Lys Glu Val Ala
1 5 

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
GAGCAGCATA AGGAAGTAG

(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Phe Asn Gly Ile Gln Ile Val Pro
1 5



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 
CATTCAATGG GATTCAAATT GTTCC 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: 

Val Gln Glu Ser Asn Gly Gly Gly
1 5 



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
GTGCAGGAAT CTAATGGTGG AGG 



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Glu Ile Gly Gly Lys Phe Thr Leu Asn
1 5



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 
GATAGGAGGG AAATTTACAT TG 



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
CGAATTGAAT GCCGCTTTG 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
CTCAAAACTK TTTGCTGGCA CC



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
GGATCRAGCA ACCTCTCTAG 

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:100: 
ACTACTTACT TCTAGTAG 

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
Ser Asp Gln Gln Val Val Ile Glu

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
CCGAYCRACA KGTCRTRATT G 
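
Several oligonucleotides in this listing, including SEQ ID NO: 102, contain degenerate positions written in IUPAC nucleotide codes (for example R for A or G, Y for C or T, K for G or T). As a hedged illustration, the Python sketch below expands such a primer into the individual sequences it represents. The helper name `expand` is an assumption introduced here and the code is not part of the listing.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


SEQ_102 = "CCGAYCRACAKGTCRTRATTG"  # SEQ ID NO: 102, spaces removed
variants = expand(SEQ_102)
print(len(variants))  # 32: five two-fold degenerate positions (Y, R, K, R, R)
```

The number of variants grows multiplicatively with each degenerate position, so this exhaustive expansion is practical only for short primers such as those listed here.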



(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

Asn Gln Thr Ser Met Thr Glu
1 5 

(2) INFORMATION FOR SEQ ID NO:104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
TCARDCTTCT ATGACAGMAC C 



(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 




(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

Gln Asp Gln Glu Lys Ile Ile Pro
1 5 



(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
CAAGATCAAG ARAARMTYAT YCCT 



(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Ser His Lys Gln Asp Gln Glu
1 5 



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
CTCRTMAACA AGATCAAG 



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Ser Gly Ser Val Thr Ala His
1 5



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
CTGGAARYGT SACGGCTC 

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
GCTTAGTATC TACTTTAAAG AG 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
GATACTATTT GATAAGTTCT CATC



(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

CTTTTGGCCT TTTTGCGAAA TAAC 



(2) INFORMATION FOR SEQ ID NO:114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:
CTGGGTTTAA CGCTTTATCA GATTGATATT C 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:
ACTTTTATTG CTAAACARGC TGC 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
TAACAGCTAC TCTTCCTTTG 



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:
GGTGACATTT TCCAAGCTAA CATTG 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 
AGAGGACGTG GAAAGTAGA 



(2) INFORMATION FOR SEQ ID NO: 119: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
GCTGTTGCGG TTGTATCTG 

(2) INFORMATION FOR SEQ ID NO: 120: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:
GATTGAGTTG AACGCTGTTG CGG 



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:
CCAGGCAACA ATCGCACCGT C 



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
CAGTACCTGA GTCTCCTTC 



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
GTGCTTGTAT TGGTTACGG

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

GAAGGATTAT ACGCTAATTG TGAAG 



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
GGAACAATTT GAATCCCATT GAATG 
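
The sequence of SEQ ID NO: 125 reads as the reverse complement of SEQ ID NO: 92, and the same relationship appears to hold for other pairs in this listing (for example SEQ ID NO: 131 and SEQ ID NO: 104). The Python sketch below is a minimal way to check this, assuming the usual IUPAC complement rules for degenerate bases. The helper name `reverse_complement` is introduced here for illustration and is not part of the listing.

```python
# Complement table covering the IUPAC codes used in this listing
# (R<->Y, K<->M, B<->V, D<->H; S, W and N are self-complementary).
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, IUPAC codes included."""
    return seq.translate(COMPLEMENT)[::-1]


SEQ_92 = "CATTCAATGGGATTCAAATTGTTCC"   # SEQ ID NO: 92
SEQ_125 = "GGAACAATTTGAATCCCATTGAATG"  # SEQ ID NO: 125
SEQ_104 = "TCARDCTTCTATGACAGMACC"      # SEQ ID NO: 104
SEQ_131 = "GGTKCTGTCATAGAAGHYTGA"      # SEQ ID NO: 131

print(reverse_complement(SEQ_92) == SEQ_125)
print(reverse_complement(SEQ_104) == SEQ_131)
```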



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
CCTCCACCAT TAGATTCCTG CAC 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
CAATGTAAAT TTCCCTCCTA TC 



(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
GGTGCCAGCA AAMAGTTTTG AG 



(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
CTAGAGAGGT TGCTYGATCC 



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:
CTACTAGAAG TAAGTAGT 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

GGTKCTGTCA TAGAAGHYTG A 



(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
AGGRATRAKY TTYTCTTGAT CTTG 



(2) INFORMATION FOR SEQ ID NO:133: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
CTTGATCTTG TTKAYGAG 



(2) INFORMATION FOR SEQ ID NO:134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
GAGCCGTSAC RYTTCCAG

9. A pesticidal toxin wherein said toxin can be encoded by a polynucleotide sequence
wherein a portion of said polynucleotide sequence can be amplified by PCR utilizing a primer
pair selected from the group consisting of SEQ ID NOS. 56 and 111, SEQ ID NOS. 56 and 112,
SEQ ID NOS. 58 and 112, SEQ ID NOS. 62 and 113, SEQ ID NOS. 62 and 114, SEQ ID NOS.
62 and 115, SEQ ID NOS. 62 and 116, SEQ ID NOS. 62 and 117, SEQ ID NOS. 64 and 114,
SEQ ID NOS. 64 and 115, SEQ ID NOS. 64 and 116, SEQ ID NOS. 64 and 117, SEQ ID NOS.
66 and 115, SEQ ID NOS. 66 and 116, SEQ ID NOS. 66 and 117, SEQ ID NOS. 68 and 116,
SEQ ID NOS. 68 and 117, SEQ ID NOS. 70 and 117, SEQ ID NOS. 74 and 118, SEQ ID NOS.
74 and 119, SEQ ID NOS. 74 and 120, SEQ ID NOS. 74 and 121, SEQ ID NOS. 74 and 122,
SEQ ID NOS. 74 and 123, SEQ ID NOS. 74 and 124, SEQ ID NOS. 76 and 119, SEQ ID NOS.
76 and 120, SEQ ID NOS. 76 and 121, SEQ ID NOS. 76 and 122, SEQ ID NOS. 76 and 123,
SEQ ID NOS. 76 and 124, SEQ ID NOS. 78 and 120, SEQ ID NOS. 78 and 121, SEQ ID NOS.
78 and 122, SEQ ID NOS. 78 and 123, SEQ ID NOS. 78 and 124, SEQ ID NOS. 80 and 121,
SEQ ID NOS. 80 and 122, SEQ ID NOS. 80 and 123, SEQ ID NOS. 80 and 124, SEQ ID NOS.
82 and 122, SEQ ID NOS. 82 and 123, SEQ ID NOS. 82 and 124, SEQ ID NOS. 84 and 123,
SEQ ID NOS. 84 and 124, SEQ ID NOS. 86 and 124, SEQ ID NOS. 90 and 125, SEQ ID NOS.
90 and 126, SEQ ID NOS. 90 and 127, SEQ ID NOS. 92 and 126, SEQ ID NOS. 92 and 127,
SEQ ID NOS. 94 and 127, SEQ ID NOS. 97 and 128, SEQ ID NOS. 97 and 129, SEQ ID NOS.
97 and 130, SEQ ID NOS. 98 and 129, SEQ ID NOS. 98 and 130, SEQ ID NOS. 99 and 130,
SEQ ID NOS. 102 and 131, SEQ ID NOS. 102 and 132, SEQ ID NOS. 102 and 133, SEQ ID
NOS. 102 and 134, SEQ ID NOS. 104 and 132, SEQ ID NOS. 104 and 133, SEQ ID NOS. 104
and 134, SEQ ID NOS. 106 and 133, SEQ ID NOS. 106 and 134, SEQ ID NOS. 108 and 134,
and SEQ ID NOS. 53 and 54.

10. The toxin, according to claim 9, wherein said primer pair is selected from the group
consisting of SEQ ID NOS. 56 and 111, SEQ ID NOS. 56 and 112, and SEQ ID NOS. 58 and
112.



11. The toxin, according to claim 9, wherein said primer pair is selected from the group
of SEQ ID NOS. 62 and 113, SEQ ID NOS. 62 and 114, SEQ ID NOS. 62 and 115, SEQ ID
NOS. 62 and 116, SEQ ID NOS. 62 and 117, SEQ ID NOS. 64 and 114, SEQ ID NOS. 64 and
115, SEQ ID NOS. 64 and 116, SEQ ID NOS. 64 and 117, SEQ ID NOS. 66 and 115, SEQ ID
NOS. 66 and 116, SEQ ID NOS. 66 and 117, SEQ ID NOS. 68 and 116, SEQ ID NOS. 68 and
117, and SEQ ID NOS. 70 and 117.






12. The toxin, according to claim 9, wherein said primer pair is selected from the group
of SEQ ID NOS. 74 and 118, SEQ ID NOS. 74 and 119, SEQ ID NOS. 74 and 120, SEQ ID
NOS. 74 and 121, SEQ ID NOS. 74 and 122, SEQ ID NOS. 74 and 123, SEQ ID NOS. 74 and
124, SEQ ID NOS. 76 and 119, SEQ ID NOS. 76 and 120, SEQ ID NOS. 76 and 121, SEQ ID
NOS. 76 and 122, SEQ ID NOS. 76 and 123, SEQ ID NOS. 76 and 124, SEQ ID NOS. 78 and
120, SEQ ID NOS. 78 and 121, SEQ ID NOS. 78 and 122, SEQ ID NOS. 78 and 123, SEQ ID
NOS. 78 and 124, SEQ ID NOS. 80 and 121, SEQ ID NOS. 80 and 122, SEQ ID NOS. 80 and
123, SEQ ID NOS. 80 and 124, SEQ ID NOS. 82 and 122, SEQ ID NOS. 82 and 123, SEQ ID
NOS. 82 and 124, SEQ ID NOS. 84 and 123, SEQ ID NOS. 84 and 124, and SEQ ID NOS. 86
and 124.

13. The toxin, according to claim 9, wherein said primer pair is selected from the group
of SEQ ID NOS. 90 and 125, SEQ ID NOS. 90 and 126, SEQ ID NOS. 90 and 127, SEQ ID
NOS. 92 and 126, SEQ ID NOS. 92 and 127, and SEQ ID NOS. 94 and 127.

14. The toxin, according to claim 9, wherein said primer pair is selected from the group
of SEQ ID NOS. 97 and 128, SEQ ID NOS. 97 and 129, SEQ ID NOS. 97 and 130, SEQ ID
NOS. 98 and 129, SEQ ID NOS. 98 and 130, and SEQ ID NOS. 99 and 130.

15. The toxin, according to claim 9, wherein said primer pair is selected from the group
of SEQ ID NOS. 102 and 131, SEQ ID NOS. 102 and 132, SEQ ID NOS. 102 and 133, SEQ ID
NOS. 102 and 134, SEQ ID NOS. 104 and 132, SEQ ID NOS. 104 and 133, SEQ ID NOS. 104
and 134, SEQ ID NOS. 106 and 133, SEQ ID NOS. 106 and 134, and SEQ ID NOS. 108 and
134.



16. The toxin, according to claim 9, wherein said primer pair is SEQ ID NOS. 53 and
54.



17. A pesticidal toxin which is immunoreactive to antibodies raised to a toxin from
PS177C8a, wherein said PS177C8a toxin is encoded by a polynucleotide sequence which
comprises SEQ ID NO. 51.



18. The toxin, according to claim 17, wherein no portion of the gene encoding said toxin
is amplified by SEQ ID NOS. 49 and 50.






19. The toxin, according to claim 17, wherein said toxin can be obtained from an isolate
selected from the group consisting of PS177C8a, PS177I8, PS66D3, KB68B55-2, PS185Y2, 
PS146F, KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2, KB58B46-2, and PS146D. 

20. A polynucleotide sequence which encodes a pesticidal toxin from a Bacillus
thuringiensis isolate selected from the group consisting of PS10E1, PS31F2, PS33D2, PS66D3,
PS68F, PS69AA2, PS146D, PS168G1, PS175I4, PS177C8a, PS177I8, PS185AA2, PS1966J4,
PS196F3, PS197T1, PS197U2, PS202E1, PS217U2, KB33, KB38, KB53A49-4, KB68B46-2,
KB68B51-2, and KB68B55-2.



21. A polynucleotide sequence encoding a pesticidal toxin wherein said toxin can be
encoded by a polynucleotide sequence which hybridizes with a sequence selected from the group
consisting of SEQ ID NOS. 18, 20, 22, 24, 26, 28, 30, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, and
fragments thereof, wherein said fragments are at least about 10 bases.

22. The polynucleotide sequence, according to claim 21, wherein said fragment is at
least about 100 bases.



23. A polynucleotide sequence wherein said sequence encodes a pesticidal toxin
belonging to a family selected from the group consisting of MIS-1, MIS-2, MIS-3, MIS-4,
MIS-5, MIS-6, and SUP-1.



24. A polynucleotide sequence encoding a pesticidal toxin, wherein a portion of said
polynucleotide sequence can be amplified by PCR utilizing a primer pair selected from the group
consisting of SEQ ID NOS. 56 and 111, SEQ ID NOS. 56 and 112, SEQ ID NOS. 58 and 112,
SEQ ID NOS. 62 and 113, SEQ ID NOS. 62 and 114, SEQ ID NOS. 62 and 115, SEQ ID NOS.
62 and 116, SEQ ID NOS. 62 and 117, SEQ ID NOS. 64 and 114, SEQ ID NOS. 64 and 115,
SEQ ID NOS. 64 and 116, SEQ ID NOS. 64 and 117, SEQ ID NOS. 66 and 115, SEQ ID NOS.
66 and 116, SEQ ID NOS. 66 and 117, SEQ ID NOS. 68 and 116, SEQ ID NOS. 68 and 117,
SEQ ID NOS. 70 and 117, SEQ ID NOS. 74 and 118, SEQ ID NOS. 74 and 119, SEQ ID NOS.
74 and 120, SEQ ID NOS. 74 and 121, SEQ ID NOS. 74 and 122, SEQ ID NOS. 74 and 123,
SEQ ID NOS. 74 and 124, SEQ ID NOS. 76 and 119, SEQ ID NOS. 76 and 120, SEQ ID NOS.
76 and 121, SEQ ID NOS. 76 and 122, SEQ ID NOS. 76 and 123, SEQ ID NOS. 76 and 124,
SEQ ID NOS. 78 and 120, SEQ ID NOS. 78 and 121, SEQ ID NOS. 78 and 122, SEQ ID NOS.
78 and 123, SEQ ID NOS. 78 and 124, SEQ ID NOS. 80 and 121, SEQ ID NOS. 80 and 122,
SEQ ID NOS. 80 and 123, SEQ ID NOS. 80 and 124, SEQ ID NOS. 82 and 122, SEQ ID NOS.
82 and 123, SEQ ID NOS. 82 and 124, SEQ ID NOS. 84 and 123, SEQ ID NOS. 84 and 124,
SEQ ID NOS. 86 and 124, SEQ ID NOS. 90 and 125, SEQ ID NOS. 90 and 126, SEQ ID NOS.
90 and 127, SEQ ID NOS. 92 and 126, SEQ ID NOS. 92 and 127, SEQ ID NOS. 94 and 127,
SEQ ID NOS. 97 and 128, SEQ ID NOS. 97 and 129, SEQ ID NOS. 97 and 130, SEQ ID NOS.
98 and 129, SEQ ID NOS. 98 and 130, SEQ ID NOS. 99 and 130, SEQ ID NOS. 102 and 131,
SEQ ID NOS. 102 and 132, SEQ ID NOS. 102 and 133, SEQ ID NOS. 102 and 134, SEQ ID
NOS. 104 and 132, SEQ ID NOS. 104 and 133, SEQ ID NOS. 104 and 134, SEQ ID NOS. 106
and 133, SEQ ID NOS. 106 and 134, SEQ ID NOS. 108 and 134, and SEQ ID NOS. 53 and 54.



25. The polynucleotide sequence, according to claim 24, wherein a portion of said
sequence can be amplified with a primer pair selected from the group consisting of SEQ ID
NOS. 56 and 111, SEQ ID NOS. 56 and 112, and SEQ ID NOS. 58 and 112.

26. The polynucleotide sequence, according to claim 25, wherein said sequence
hybridizes with SEQ ID NO. 26.



27. The polynucleotide sequence, according to claim 24, wherein a portion of said
sequence can be amplified from a primer pair selected from the group of SEQ ID NOS. 62 and
113, SEQ ID NOS. 62 and 114, SEQ ID NOS. 62 and 115, SEQ ID NOS. 62 and 116, SEQ ID
NOS. 62 and 117, SEQ ID NOS. 64 and 114, SEQ ID NOS. 64 and 115, SEQ ID NOS. 64 and
116, SEQ ID NOS. 64 and 117, SEQ ID NOS. 66 and 115, SEQ ID NOS. 66 and 116, SEQ ID
NOS. 66 and 117, SEQ ID NOS. 68 and 116, SEQ ID NOS. 68 and 117, and SEQ ID NOS. 70
and 117.



28. The polynucleotide sequence, according to claim 27, wherein said sequence
hybridizes with a probe selected from the group consisting of SEQ ID NOS. 20, 24, and 41.

29. The polynucleotide sequence, according to claim 24, wherein a portion of said
sequence can be amplified from a primer pair selected from the group of SEQ ID NOS. 74 and
118, SEQ ID NOS. 74 and 119, SEQ ID NOS. 74 and 120, SEQ ID NOS. 74 and 121, SEQ ID
NOS. 74 and 122, SEQ ID NOS. 74 and 123, SEQ ID NOS. 74 and 124, SEQ ID NOS. 76 and
119, SEQ ID NOS. 76 and 120, SEQ ID NOS. 76 and 121, SEQ ID NOS. 76 and 122, SEQ ID
NOS. 76 and 123, SEQ ID NOS. 76 and 124, SEQ ID NOS. 78 and 120, SEQ ID NOS. 78 and
121, SEQ ID NOS. 78 and 122, SEQ ID NOS. 78 and 123, SEQ ID NOS. 78 and 124, SEQ ID
NOS. 80 and 121, SEQ ID NOS. 80 and 122, SEQ ID NOS. 80 and 123, SEQ ID NOS. 80 and
124, SEQ ID NOS. 82 and 122, SEQ ID NOS. 82 and 123, SEQ ID NOS. 82 and 124, SEQ ID
NOS. 84 and 123, SEQ ID NOS. 84 and 124, and SEQ ID NOS. 86 and 124.

30. The polynucleotide sequence, according to claim 29, wherein said sequence
hybridizes with a probe selected from the group consisting of SEQ ID NOS. 28 and 22.

31. The polynucleotide sequence, according to claim 24, wherein a portion of said
sequence can be amplified from a primer pair selected from the group of SEQ ID NOS. 90 and
125, SEQ ID NOS. 90 and 126, SEQ ID NOS. 90 and 127, SEQ ID NOS. 92 and 126, SEQ ID
NOS. 92 and 127, and SEQ ID NOS. 94 and 127.

32. The polynucleotide sequence, according to claim 24, wherein said sequence
hybridizes with a probe selected from the group consisting of SEQ ID NO. 43.

33. The polynucleotide sequence, according to claim 24, wherein a portion of said
sequence can be amplified from a primer pair selected from the group of SEQ ID NOS. 97 and
128, SEQ ID NOS. 97 and 129, SEQ ID NOS. 97 and 130, SEQ ID NOS. 98 and 129, SEQ ID
NOS. 98 and 130, and SEQ ID NOS. 99 and 130.

34. The polynucleotide sequence, according to claim 33, wherein said sequence

35. The polynucleotide sequence, according to claim 24, wherein a portion of said
sequence can be amplified from a primer pair selected from the group of SEQ ID NOS. 102 and
131, SEQ ID NOS. 102 and 132, SEQ ID NOS. 102 and 133, SEQ ID NOS. 102 and 134, SEQ
ID NOS. 104 and 132, SEQ ID NOS. 104 and 133, SEQ ID NOS. 104 and 134, SEQ ID NOS.
106 and 133, SEQ ID NOS. 106 and 134, and SEQ ID NOS. 108 and 134.

36. The polynucleotide sequence, according to claim 35, wherein said sequence
hybridizes with a probe selected from the group consisting of SEQ ID NOS. 18, 30, 35, 37, 39,
and 45.

37. The polynucleotide sequence, according to claim 24, wherein a portion of said
sequence can be amplified from primer pair SEQ ID NOS. 53 and 54.

38. The polynucleotide sequence, according to claim 37, wherein said sequence
hybridizes with a probe selected from the group consisting of SEQ ID NOS. 10, 12, and 15.

39. The polynucleotide sequence, according to claim 23, wherein said sequence is 
optimized for expression in plants. 

40. A polynucleotide sequence encoding a pesticidal toxin which is immunoreactive
to antibodies raised to a toxin from PS177C8a, wherein said PS177C8a toxin is encoded by a
polynucleotide sequence which comprises SEQ ID NO. 51.

41. The polynucleotide sequence, according to claim 40, wherein no portion of the gene
encoding said toxin is amplified by SEQ ID NOS. 49 and 50.

42. The polynucleotide sequence, according to claim 40, wherein said toxin can be
obtained from an isolate selected from the group consisting of PS177C8a, PS177I8, PS66D3,
KB68B55-2, PS185Y2, PS146F, KB53A49-4, PS175I4, KB68B51-2, PS28K1, PS31F2,
KB58B46-2, and PS146D.



43. A polynucleotide sequence useful as a PCR primer or a hybridization probe,
wherein said polynucleotide sequence is selected from the group consisting of SEQ ID NOS. 3,
5, 7, 10, 12, 15, 18, 20, 22, 24, 26, 28, 30, 31, 33, 35, 37, 39, 41, 43, 45, 47, 48, 51, 53, 54, and
55-134.



44. A polynucleotide sequence comprising a sequence selected from the group

43, 45, 47, 48, 51, 53, 54, and 55-134.

45. A method for forming a pore in a cell membrane, wherein said method comprises
contacting said cell membrane with a toxin belonging to a family selected from the group
consisting of MIS-1, MIS-2, MIS-3, MIS-4, MIS-5, MIS-6, and SUP-1.

46. A method for controlling a non-mammalian pest, wherein said method comprises 
contacting said pest with a toxin belonging to a family selected from the group consisting of 
MIS-1, MIS-2, MIS-3, MIS-4, MIS-5, MIS-6, and SUP-1.



47. A transformed host comprising a polynucleotide sequence encoding a pesticidal
toxin belonging to a family selected from the group consisting of MIS-1, MIS-2, MIS-3, MIS-4,
MIS-5, MIS-6, and SUP-1.

48. The transformed host, according to claim 47, wherein said host is a plant.

49. The transformed host, according to claim 47, wherein said host is a bacterium.



50. A transformed host comprising a heterologous polynucleotide encoding a pesticidal
toxin wherein said toxin can be encoded by a polynucleotide sequence wherein a portion of said
polynucleotide sequence can be amplified by a primer pair selected from the group consisting
of SEQ ID NOS. 56 and 111, SEQ ID NOS. 56 and 112, SEQ ID NOS. 58 and 112, SEQ ID
NOS. 62 and 113, SEQ ID NOS. 62 and 114, SEQ ID NOS. 62 and 115, SEQ ID NOS. 62 and
116, SEQ ID NOS. 62 and 117, SEQ ID NOS. 64 and 114, SEQ ID NOS. 64 and 115, SEQ ID
NOS. 64 and 116, SEQ ID NOS. 64 and 117, SEQ ID NOS. 66 and 115, SEQ ID NOS. 66 and
116, SEQ ID NOS. 66 and 117, SEQ ID NOS. 68 and 116, SEQ ID NOS. 68 and 117, SEQ ID
NOS. 70 and 117, SEQ ID NOS. 74 and 118, SEQ ID NOS. 74 and 119, SEQ ID NOS. 74 and
120, SEQ ID NOS. 74 and 121, SEQ ID NOS. 74 and 122, SEQ ID NOS. 74 and 123, SEQ ID
NOS. 74 and 124, SEQ ID NOS. 76 and 119, SEQ ID NOS. 76 and 120, SEQ ID NOS. 76 and
121, SEQ ID NOS. 76 and 122, SEQ ID NOS. 76 and 123, SEQ ID NOS. 76 and 124, SEQ ID
NOS. 78 and 120, SEQ ID NOS. 78 and 121, SEQ ID NOS. 78 and 122, SEQ ID NOS. 78 and
123, SEQ ID NOS. 78 and 124, SEQ ID NOS. 80 and 121, SEQ ID NOS. 80 and 122, SEQ ID
NOS. 80 and 123, SEQ ID NOS. 80 and 124, SEQ ID NOS. 82 and 122, SEQ ID NOS. 82 and
123, SEQ ID NOS. 82 and 124, SEQ ID NOS. 84 and 123, SEQ ID NOS. 84 and 124, SEQ ID
NOS. 86 and 124, SEQ ID NOS. 90 and 125, SEQ ID NOS. 90 and 126, SEQ ID NOS. 90 and
127, SEQ ID NOS. 92 and 126, SEQ ID NOS. 92 and 127, SEQ ID NOS. 94 and 127, SEQ ID
NOS. 97 and 128, SEQ ID NOS. 97 and 129, SEQ ID NOS. 97 and 130, SEQ ID NOS. 98 and
129, SEQ ID NOS. 98 and 130, SEQ ID NOS. 99 and 130, SEQ ID NOS. 102 and 131, SEQ ID
NOS. 102 and 132, SEQ ID NOS. 102 and 133, SEQ ID NOS. 102 and 134, SEQ ID NOS. 104 
and 132, SEQ ID NOS. 104 and 133, SEQ ID NOS. 104 and 134, SEQ ID NOS. 106 and 133, 
SEQ ID NOS. 106 and 134, SEQ ID NOS. 108 and 134, and SEQ ID NOS. 53 and 54. 
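
The condition that a portion of a polynucleotide "can be amplified by a primer pair" can be illustrated with a simple in silico check. The Python sketch below is an assumption-laden illustration, not the procedure of the specification. It looks for an exact, IUPAC-aware match of the forward primer on the given strand and of the reverse complement of the reverse primer downstream of it. The function names and the toy template are hypothetical, and real PCR tolerates mismatches and depends on reaction conditions that this sketch ignores.

```python
import re

# IUPAC nucleotide codes as regular-expression character classes.
IUPAC_RE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Reverse complement, including IUPAC degenerate codes."""
    return seq.translate(COMPLEMENT)[::-1]


def site_pattern(primer):
    """Compile a regex matching every sequence a degenerate primer encodes."""
    return re.compile("".join(IUPAC_RE[base] for base in primer))


def predicted_amplicon(template, forward, reverse):
    """Return the region spanned by the primer pair on `template`, or None.

    Only exact matches on the given strand are considered (hypothetical helper).
    """
    fwd = site_pattern(forward).search(template)
    if fwd is None:
        return None
    rev = site_pattern(reverse_complement(reverse)).search(template, fwd.end())
    if rev is None:
        return None
    return template[fwd.start():rev.end()]


FORWARD = "CATTCAATGGGATTCAAATTGTTCC"  # SEQ ID NO: 92
REVERSE = "CCTCCACCATTAGATTCCTGCAC"    # SEQ ID NO: 126
# Toy template: the flanking bases and the spacer are invented for illustration.
template = "AAAA" + FORWARD + "ACGTACGT" + reverse_complement(REVERSE) + "TTTT"
print(predicted_amplicon(template, FORWARD, REVERSE))
```

Matching the reverse primer through its reverse complement keeps the search on a single strand, which is sufficient for this sketch; a fuller treatment would also scan the opposite strand.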

51. The transformed host, according to claim 50, wherein said host is a plant.

52. The transformed host, according to claim 50, wherein said host is a bacterium.